Merge lp:~camptocamp/banking-addons/improve_lookup into lp:banking-addons/bank-statement-reconcile-70

Proposed by Nicolas Bessi - Camptocamp
Status: Merged
Merged at revision: 88
Proposed branch: lp:~camptocamp/banking-addons/improve_lookup
Merge into: lp:banking-addons/bank-statement-reconcile-70
Diff against target: 125 lines (+47/-36)
1 file modified
account_statement_base_completion/statement.py (+47/-36)
To merge this branch: bzr merge lp:~camptocamp/banking-addons/improve_lookup
Reviewer Review Type Date Requested Status
Alexandre Fayolle - camptocamp Approve
Review via email: mp+154730@code.launchpad.net

Description of the change

Fixes performance trouble when using bank_statement_label based completion rules by using memoizer pattern.

Add lines in context to be able to acces them in completion rules. It is not mandatory as we can do line.satement_id.line_ids but it is more efficient.

Some minor cleanup

To post a comment you must log in.
Revision history for this message
Alexandre Fayolle - camptocamp (alexandre-fayolle-c2c) wrote :

My main concern is the generation of an invalid regular expression on lines 30-31 : what if the partner.bank_statement_label contains "special" characters such as .[]()^$+*? this could lead to a crash in the query.

Unless we are really really sure this is not possible, there should be some escaping performed on this string before converting it to a regex. I think there is a re.escape function available in Python to do just this.

line 37-38: useless, please remove

line 23: is context['label_memoizer'] used outside this function? If not, this could be simply a local variable.

line 101: get is a method -> use () instead of [] (or if you're sure the key is in there, just [] without get)

review: Needs Fixing (code review, no test)
Revision history for this message
Alexandre Fayolle - camptocamp (alexandre-fayolle-c2c) wrote :

> My main concern is the generation of an invalid regular expression on lines
> 30-31 : what if the partner.bank_statement_label contains "special" characters
> such as .[]()^$+*? this could lead to a crash in the query.
>
> Unless we are really really sure this is not possible, there should be some
> escaping performed on this string before converting it to a regex. I think
> there is a re.escape function available in Python to do just this.
>
> line 37-38: useless, please remove
>
> line 23: is context['label_memoizer'] used outside this function? If not,
> this could be simply a local variable.
>
> line 101: get is a method -> use () instead of [] (or if you're sure the key
> is in there, just [] without get)

and l21: s/Follwing/Following/

Revision history for this message
Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

Hello,

Thank for your comments

I think I will add constraints on the bank_label_field this will be more explicit to the end user. What do you think.

line 23: The memoizer is passed trough each line of the statement it is not locally scoped as the function is called for each line of the statement. That's why the previous implementation was innefficient.

line 101. Tho.. finger crossed, I wanted to remove the get usage. That why we made code review ;)

Revision history for this message
Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

After a discution with Frederic this pattern are widely used. So I think I'll use ilike or is similar to expression

Revision history for this message
Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

Ok we have a misunderstanding with Fréderic about the specifications.
re.escape will do the trick.

Regards

Revision history for this message
Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

Hello,
Proposed fixes added

Revision history for this message
Alexandre Fayolle - camptocamp (alexandre-fayolle-c2c) wrote :

LGTM

On a totally minor side, I'd prefer "import re" and using "re.escape" the code; The reason being that there are several "escape" functions in the stdlib (for xml, cgi, shell command lines, re...) and it is easier for the reader when coming on the code to understand the context with a namespaced call.

review: Approve
90. By Nicolas Bessi - Camptocamp

[IMP] import readability

Revision history for this message
Nicolas Bessi - Camptocamp (nbessi-c2c-deactivatedaccount) wrote :

Improve import

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'account_statement_base_completion/statement.py'
2--- account_statement_base_completion/statement.py 2013-03-01 13:33:03 +0000
3+++ account_statement_base_completion/statement.py 2013-03-22 08:33:26 +0000
4@@ -18,6 +18,9 @@
5 # along with this program. If not, see <http://www.gnu.org/licenses/>.
6 #
7 ##############################################################################
8+from collections import defaultdict
9+import re
10+
11 from tools.translate import _
12 from openerp.osv.orm import Model, fields
13 from openerp.tools import DEFAULT_SERVER_DATETIME_FORMAT
14@@ -269,35 +272,44 @@
15 """
16 partner_obj = self.pool.get('res.partner')
17 st_obj = self.pool.get('account.bank.statement.line')
18+ res = {}
19+ # As we have to iterate on each partner for each line,
20+ # we memoize the pair to avoid
21+ # to redo computation for each line.
22+ # Following code can be done by a single SQL query
23+ # but this option is not really maintanable
24+ if not context.get('label_memoizer'):
25+ context['label_memoizer'] = defaultdict(list)
26+ partner_ids = partner_obj.search(cr,
27+ uid,
28+ [('bank_statement_label', '!=', False)])
29+ line_ids = tuple(x.id for x in context.get('line_ids', []))
30+ for partner in partner_obj.browse(cr, uid, partner_ids, context=context):
31+ vals = '|'.join(re.escape(x.strip()) for x in partner.bank_statement_label.split(';'))
32+ or_regex = ".*%s*." % vals
33+ sql = ("SELECT id from account_bank_statement_line"
34+ " WHERE id in %s"
35+ " AND name ~* %s")
36+ cr.execute(sql, (line_ids, or_regex))
37+ pairs = cr.fetchall()
38+ for pair in pairs:
39+ context['label_memoizer'][pair[0]].append(partner)
40 st_line = st_obj.browse(cr, uid, line_id, context=context)
41- res = {}
42- compt = 0
43- if st_line:
44- ids = partner_obj.search(
45- cr,
46- uid,
47- [('bank_statement_label', '!=', False)],
48- context=context)
49- for partner in partner_obj.browse(cr, uid, ids, context=context):
50- for partner_label in partner.bank_statement_label.split(';'):
51- if partner_label in st_line.label:
52- compt += 1
53- res['partner_id'] = partner.id
54- if compt > 1:
55- raise ErrorTooManyPartner(
56- _('Line named "%s" (Ref:%s) was matched by '
57- 'more than one partner.') %
58- (st_line.name, st_line.ref))
59- if res:
60- st_vals = st_obj.get_values_for_line(
61- cr,
62- uid,
63- profile_id=st_line.statement_id.profile_id.id,
64- partner_id=res.get('partner_id', False),
65- line_type=st_line.type,
66- amount=st_line.amount,
67- context=context)
68- res.update(st_vals)
69+ if st_line and st_line.id in context['label_memoizer']:
70+ found_partner = context['label_memoizer'][st_line.id]
71+ if len(found_partner) > 1:
72+ raise ErrorTooManyPartner(_('Line named "%s" (Ref:%s) was matched by '
73+ 'more than one partner.') %
74+ (st_line.name, st_line.ref))
75+ res['partner_id'] = found_partner[0].id
76+ st_vals = st_obj.get_values_for_line(cr,
77+ uid,
78+ profile_id=st_line.statement_id.profile_id.id,
79+ partner_id=found_partner[0].id,
80+ line_type=st_line.type,
81+ amount=st_line.amount,
82+ context=context)
83+ res.update(st_vals)
84 return res
85
86 def get_from_label_and_partner_name(self, cr, uid, line_id, context=None):
87@@ -322,24 +334,22 @@
88 st_line = st_obj.browse(cr, uid, line_id, context=context)
89 if st_line:
90 sql = "SELECT id FROM res_partner WHERE name ~* %s"
91- pattern = ".*%s.*" % st_line.label
92+ pattern = ".*%s.*" % re.escape(st_line.label)
93 cr.execute(sql, (pattern,))
94 result = cr.fetchall()
95 if not result:
96 return res
97 if len(result) > 1:
98- raise ErrorTooManyPartner(
99- _('Line named "%s" (Ref:%s) was matched by more '
100- 'than one partner.') %
101- (st_line.name, st_line.ref))
102- for id in result[0]:
103- res['partner_id'] = id
104+ raise ErrorTooManyPartner(_('Line named "%s" (Ref:%s) was matched by more '
105+ 'than one partner.') %
106+ (st_line.name, st_line.ref))
107+ res['partner_id'] = result[0][0] if result else False
108 if res:
109 st_vals = st_obj.get_values_for_line(
110 cr,
111 uid,
112 profile_id=st_line.statement_id.profile_id.id,
113- partner_id=res.get('partner_id', False),
114+ partner_id=res['partner_id'],
115 line_type=st_line.type,
116 amount=st_line.amount,
117 context=context)
118@@ -475,6 +485,7 @@
119 for stat in self.browse(cr, uid, ids, context=context):
120 msg_lines = []
121 ctx = context.copy()
122+ ctx['line_ids'] = stat.line_ids
123 for line in stat.line_ids:
124 res = {}
125 try:

Subscribers

People subscribed via source and target branches