Merge lp:~adeuring/launchpad/bug-1020443 into lp:launchpad
| Status: | Merged |
|---|---|
| Approved by: | j.c.sackett on 2012-07-19 |
| Approved revision: | 15588 |
| Merge reported by: | Abel Deuring |
| Merged at revision: | not available |
| Proposed branch: | lp:~adeuring/launchpad/bug-1020443 |
| Merge into: | lp:launchpad |
| Diff against target: |
328 lines (+191/-30) 2 files modified
database/schema/patch-2209-24-3.sql (+124/-0) lib/lp/services/database/doc/textsearching.txt (+67/-30) |
| To merge this branch: | bzr merge lp:~adeuring/launchpad/bug-1020443 |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| j.c.sackett (community) | 2012-07-19 | Approve on 2012-07-19 | |
|
Review via email:
|
|||
Commit Message
Ignore the symbols "&|!" in full text searches; don't treat a leading '-' as the "NOT" operator.
Description of the Change
This branch fixes bug 1020443: Search terms consisting of a tsquery operator
surrounded by punctuation can lead to OOPSes.
This bug is fallout from my previous work on bug 29713,
lp:~adeuring/launchpad/bug-29713-db-dev .
Background: We have a DB procedure ftq() which processes full text
query strings so that they can be passed to the Postgres procedure
to_tsquery() which in turn generates a query object that can be used
in SQL expressions like
SELECT ... FROM bugtaskflat WHERE bugtaskflat.fti @@ query_object;
to_tsquery() supports more complex expressions, including grouping
terms with parentheses and the logical operators '&', '|' and '!',
with a simiar meaning as in C or Python.
It is obviously easy to provide syntactically incorrect search
expressions, so ftq() tries hard to fix possible errors -- but I broke
the proper treatment of "badly placed" logical operators in the branch
mentioned above.
The example query string from bug 1020443, "?!.", shows that users do
not always expect that "&", "|", "!" are treated in any special way,
so I asked in https:/
if we should continue to treat "&|!" in full text searches as operators
or not. The email contains two more examples where typical source code
text pasted as a full text query leads to surprising results, due to
the interpretation of "&|!" as logical operators.
I got one "vote" from Curtis that ignoring "&|!" in search queries
makes sense and none votes againt doing this; since I think too that
using "&|!" as operators causes more confusion that being useful,
I removed this option.
The technical side: to_tsquery() interpret exactly the characters
"&|!" as logical operators; if they apper in text that is indexed,
they are treated like spaces. SO ftq() now replaces them too with
spaces.
To increase "search consistency" a bit more, I also removed the
"feature" that a '-' preceding a word is interpreted as a the "NOT"
operator because it makes it impossible to search for negative
numbers (which are stored including the '-' in the full text index).
Since ftq() is a stored procedure, I added the usual DB patch file.
The new file patch-2209-24-2.sql is an only slightly modified variant
of patch-2209-
+++ database/
@@ -17,11 +17,13 @@
query = args[0]
## plpy.debug('1 query is %s' % repr(query))
+ # Replace tsquery operators with ' '.
+ query = re.sub('[|&!]', ' ', query)
+
# Normalize whitespace
query = re.sub("(?u)\s+"," ", query)
- # Convert AND, OR, NOT and - to tsearch2 punctuation
- query = re.sub(
+ # Convert AND, OR, NOT to tsearch2 punctuation
query = re.sub(
query = re.sub(
query = re.sub(
@@ -34,9 +36,6 @@
query = re.sub(r"(?u)%s+" % (punctuation,), " ", query)
## plpy.debug('3 query is %s' % repr(query))
- # Strip ! characters inside and at the end of a word
- query = re.sub(
-
# Now that we have handle case sensitive booleans, convert to lowercase
query = query.lower()
@@ -122,4 +121,4 @@
return query or None
$_$;
-INSERT INTO LaunchpadDataba
+INSERT INTO LaunchpadDataba
The part "@@ -34,9 +36,6 @@" removes a substitution that is not
longer needed since any "!" symbols are already removed from the
query string.
Since this a "hot patch"
(https:/
-- just the change of a procedure butno schema changes, trigger change,
index change etc --, this MP uses the devel branch as the target.
test: ./bin/test services.database -vvt textsearching.txt
= Launchpad lint =
Checking for conflicts and issues in changed files.
Linting changed files:
database/
lib/lp/
./lib/lp/
734: want exceeds 78 characters.
make: *** [lint] Fehler 1
That's not caused by my changes -- and hard to fix...
| j.c.sackett (jcsackett) wrote : | # |
Erhm, spoke to soon--I didn't see that there appears to be a conflict in need of resolution.
- 15588. By Abel Deuring on 2012-07-19
-
DB patch file renamed.
| j.c.sackett (jcsackett) wrote : | # |
Looks good now, thanks. Remember to update the INSERT INTO LaunchpadDataba
- 15589. By Abel Deuring on 2012-07-19
-
DB Patch number adjusted.
- 15590. By Abel Deuring on 2012-07-26
-
devel merged

This looks good, thanks.