Merge lp:~stub/launchpad/bug-778338-transient-schema into lp:launchpad

Proposed by Stuart Bishop
Status: Work in progress
Proposed branch: lp:~stub/launchpad/bug-778338-transient-schema
Merge into: lp:launchpad
Prerequisite: lp:~stub/launchpad/bug-179821-lowercase-tablenames
Diff against target: 361 lines (+78/-46)
3 files modified
lib/canonical/database/ftests/test_multitablecopy.txt (+26/-14)
lib/canonical/database/multitablecopy.py (+25/-18)
lib/canonical/database/postgresql.py (+27/-14)
To merge this branch: bzr merge lp:~stub/launchpad/bug-778338-transient-schema
Reviewer Review Type Date Requested Status
Launchpad code reviewers Pending
Review via email: mp+60652@code.launchpad.net

Description of the change

= Summary =

Multitable Copy creates transient tables. These may be discovered by the replication tools and make them explode because they don't know what to do with them.

== Proposed fix ==

Put the transient tables in their own schema, which will automatically be ignored. This seems cleaner than just white listing 'temp_%' and relying on a well known prefix.

== Pre-implementation notes ==

== Implementation details ==

== Tests ==

== Demo and Q/A ==

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/canonical/database/postgresql.py
  lib/canonical/database/ftests/test_multitablecopy.txt
  lib/canonical/database/multitablecopy.py

./lib/canonical/database/ftests/test_multitablecopy.txt
       1: narrative uses a moin header.
      14: narrative uses a moin header.
      54: narrative uses a moin header.
      84: narrative uses a moin header.
     121: narrative uses a moin header.
     157: narrative uses a moin header.
     199: narrative exceeds 78 characters.
     244: narrative uses a moin header.
     274: narrative uses a moin header.
     276: narrative exceeds 78 characters.
     311: narrative exceeds 78 characters.
     439: narrative uses a moin header.
     486: narrative uses a moin header.
     498: narrative has trailing whitespace.
     590: narrative uses a moin header.
     658: narrative uses a moin header.

To post a comment you must log in.
13016. By Stuart Bishop

Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.

Unmerged revisions

13016. By Stuart Bishop

Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.

13015. By Stuart Bishop

delint

13014. By Stuart Bishop

Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.

13013. By Stuart Bishop

multitablecopy needs to create transient tables in a different schema so they are ignored for replication

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/canonical/database/ftests/test_multitablecopy.txt'
2--- lib/canonical/database/ftests/test_multitablecopy.txt 2011-05-12 07:06:44 +0000
3+++ lib/canonical/database/ftests/test_multitablecopy.txt 2011-05-12 07:06:46 +0000
4@@ -111,9 +111,11 @@
5
6 We have the data we're copying in holding tables now.
7
8- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
9+ >>> schema, name = copier.getRawHoldingTableName('numeric')
10+ >>> postgresql.have_table(cur, name, schema)
11 True
12- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
13+ >>> schema, name = copier.getRawHoldingTableName('numeric')
14+ >>> postgresql.have_table(cur, name, schema)
15 True
16
17
18@@ -145,9 +147,11 @@
19
20 And the holding tables are gone.
21
22- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
23+ >>> schema, name = copier.getRawHoldingTableName('numeric')
24+ >>> postgresql.have_table(cur, name, schema)
25 False
26- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
27+ >>> schema, name = copier.getRawHoldingTableName('textual')
28+ >>> postgresql.have_table(cur, name, schema)
29 False
30
31
32@@ -280,15 +284,19 @@
33 >>> copier.needsRecovery()
34 False
35
36- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
37+ >>> schema, name = copier.getRawHoldingTableName('textual')
38+ >>> postgresql.have_table(cur, name, schema)
39 True
40- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
41+ >>> schema, name = copier.getRawHoldingTableName('numeric')
42+ >>> postgresql.have_table(cur, name, schema)
43 True
44
45 >>> copier.dropHoldingTables()
46- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
47+ >>> schema, name = copier.getRawHoldingTableName('textual')
48+ >>> postgresql.have_table(cur, name, schema)
49 False
50- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
51+ >>> schema, name = copier.getRawHoldingTableName('numeric')
52+ >>> postgresql.have_table(cur, name, schema)
53 False
54
55 >>> cur.execute("SELECT t, count(*) FROM textual GROUP BY t ORDER BY t")
56@@ -331,9 +339,11 @@
57 >>> transaction.begin()
58 <transaction...
59 >>> cur = cursor()
60- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
61+ >>> schema, name = copier.getRawHoldingTableName('textual')
62+ >>> postgresql.have_table(cur, name, schema)
63 False
64- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
65+ >>> schema, name = copier.getRawHoldingTableName('numeric')
66+ >>> postgresql.have_table(cur, name, schema)
67 True
68
69 Our textual data has been copied, so the textual table now lists each of its
70@@ -370,9 +380,11 @@
71 This time we run to completion without problems.
72
73 >>> cur = cursor()
74- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
75+ >>> schema, name = copier.getRawHoldingTableName('textual')
76+ >>> postgresql.have_table(cur, name, schema)
77 False
78- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
79+ >>> schema, name = copier.getRawHoldingTableName('numeric')
80+ >>> postgresql.have_table(cur, name, schema)
81 False
82
83 >>> cur.execute("""
84@@ -638,10 +650,10 @@
85
86 >>> copier.pour(transaction)
87 Pouring textual
88- Pouring text from "temp_textual_holding_test" to textual
89+ Pouring text from "transient"."textual_holding_test" to textual
90 ...
91 Pouring numeric
92- Pouring numbers from "temp_numeric_holding_test" to numeric
93+ Pouring numbers from "transient"."numeric_holding_test" to numeric
94 ...
95
96
97
98=== modified file 'lib/canonical/database/multitablecopy.py'
99--- lib/canonical/database/multitablecopy.py 2011-05-12 07:06:44 +0000
100+++ lib/canonical/database/multitablecopy.py 2011-05-12 07:06:46 +0000
101@@ -247,30 +247,31 @@
102
103 def dropHoldingTables(self):
104 """Drop any holding tables that may exist for this MultiTableCopy."""
105- holding_tables = [self.getHoldingTableName(table)
106- for table in self.tables]
107- postgresql.drop_tables(cursor(), holding_tables)
108+ for table in self.tables:
109+ schema, raw_table = self.getRawHoldingTableName(table)
110+ postgresql.drop_tables(cursor(), raw_table, schema)
111
112 def getRawHoldingTableName(self, tablename, suffix=''):
113- """Name for a holding table, but without quotes. Use with care."""
114+ """(schema, name) for a holding table, but without quotes."""
115 if suffix:
116 suffix = '_%s' % suffix
117
118 assert re.search(r'[^a-z_]', tablename + suffix) is None, (
119 'Unsupported characters in table name per Bug #179821')
120
121- raw_name = "temp_%s_holding_%s%s" % (
122+ raw_name = "%s_holding_%s%s" % (
123 str(tablename), self.name, str(suffix))
124
125- return raw_name
126+ # In a separate schema for transient tables per Bug #778338.
127+ return ("transient", raw_name)
128
129 def getHoldingTableName(self, tablename, suffix=''):
130 """Name for a holding table to hold data being copied in tablename.
131
132 Return value is properly quoted for use as an SQL identifier.
133 """
134- raw_name = self.getRawHoldingTableName(tablename, suffix)
135- return quoteIdentifier(raw_name)
136+ schema, tablename = self.getRawHoldingTableName(tablename, suffix)
137+ return "%s.%s" % (quoteIdentifier(schema), quoteIdentifier(tablename))
138
139 def _pointsToTable(self, source_table, foreign_key):
140 """Name of table that source_table.foreign_key refers to.
141@@ -485,14 +486,16 @@
142 """Index id column on holding table.
143
144 Creates a unique index on "id" column in holding table. The index
145- gets the name of the holding table with "_id" appended.
146+ gets the name of the holding table with "temp_" prepended and
147+ "_id" appended.
148 """
149 source_table = str(source_table)
150 self.logger.debug("Indexing %s" % holding_table)
151+ index_name = quoteIdentifier('temp_%s_id' % holding_table.lower())
152 cur.execute('''
153 CREATE UNIQUE INDEX %s
154 ON %s (id)
155- ''' % (self.getHoldingTableName(source_table, 'id'), holding_table))
156+ ''' % (index_name, holding_table))
157
158 def _retargetForeignKeys(self, holding_table, joins, cur):
159 """Replace foreign keys in new holding table.
160@@ -522,15 +525,18 @@
161 # If there are any holding tables to be poured into their source
162 # tables, there must at least be one for the last table that pour()
163 # processes.
164- last_holding_table = self.getRawHoldingTableName(self.tables[-1])
165- if not postgresql.have_table(cur, last_holding_table):
166+ schema, last_holding_table = self.getRawHoldingTableName(
167+ self.tables[-1])
168+ if not postgresql.have_table(cur, last_holding_table, schema):
169 return False
170
171 # If the first table in our list also still exists, and it still has
172 # its new_id column, then the pouring process had not begun yet.
173 # Assume the data was not ready for pouring.
174- first_holding_table = self.getRawHoldingTableName(self.tables[0])
175- if postgresql.table_has_column(cur, first_holding_table, 'new_id'):
176+ schema, first_holding_table = self.getRawHoldingTableName(
177+ self.tables[0])
178+ if postgresql.table_has_column(
179+ cur, first_holding_table, 'new_id', schema):
180 self.logger.info(
181 "Previous run aborted too early for recovery; redo all")
182 return False
183@@ -568,9 +574,10 @@
184 # there's a matching holding table. If so, prepare it, pour it back
185 # into the source table, and drop.
186 for table in self.tables:
187- holding_table_unquoted = self.getRawHoldingTableName(table)
188+ schema, holding_table_unquoted = self.getRawHoldingTableName(
189+ table)
190
191- if not postgresql.have_table(cur, holding_table_unquoted):
192+ if not postgresql.have_table(cur, holding_table_unquoted, schema):
193 # We know we're in a suitable state for pouring. If this
194 # table does not exist, it must be because it's been poured
195 # out completely and dropped in an earlier instance of this
196@@ -584,14 +591,14 @@
197 tablestarttime = time.time()
198
199 has_new_id = postgresql.table_has_column(
200- cur, holding_table_unquoted, 'new_id')
201+ cur, holding_table_unquoted, 'new_id', schema)
202
203 self._pourTable(
204 holding_table, table, has_new_id, transaction_manager)
205
206 # Drop holding table. It may still contain rows with id set to
207 # null. Those must not be poured.
208- postgresql.drop_tables(cursor(), holding_table)
209+ postgresql.drop_tables(cursor(), holding_table_unquoted, schema)
210
211 self.logger.debug(
212 "Pouring %s took %.3f seconds."
213
214=== modified file 'lib/canonical/database/postgresql.py'
215--- lib/canonical/database/postgresql.py 2009-09-21 08:13:40 +0000
216+++ lib/canonical/database/postgresql.py 2011-05-12 07:06:46 +0000
217@@ -85,7 +85,7 @@
218
219 for t in cur.fetchall():
220 # t == (src_table, src_column, dest_table, dest_column, upd, del)
221- if t not in _state: # Avoid loops
222+ if t not in _state: # Avoid loops
223 _state.append(t)
224 # Recurse, Locating references to the reference we just found.
225 listReferences(cur, t[0], t[1], _state)
226@@ -93,6 +93,7 @@
227 # from the original (table, column), making it easier to change keys
228 return _state
229
230+
231 def listUniques(cur, table, column):
232 '''Return a list of unique indexes on `table` that include the `column`
233
234@@ -175,6 +176,7 @@
235 rv.append(tuple(keys))
236 return rv
237
238+
239 def listSequences(cur):
240 """Return a list of (schema, sequence, table, column) tuples.
241
242@@ -206,7 +208,7 @@
243 for schema, sequence in list(cur.fetchall()):
244 match = re.search('^(\w+)_(\w+)_seq$', sequence)
245 if match is None:
246- rv.append( (schema, sequence, None, None) )
247+ rv.append((schema, sequence, None, None))
248 else:
249 table = match.group(1)
250 column = match.group(2)
251@@ -225,11 +227,12 @@
252 cur.execute(sql, dict(schema=schema, table=table, column=column))
253 num = cur.fetchone()[0]
254 if num == 1:
255- rv.append( (schema, sequence, table, column) )
256+ rv.append((schema, sequence, table, column))
257 else:
258- rv.append( (schema, sequence, None, None) )
259+ rv.append((schema, sequence, None, None))
260 return rv
261
262+
263 def generateResetSequencesSQL(cur):
264 """Return SQL that will reset table sequences to match the data in them.
265 """
266@@ -257,6 +260,7 @@
267 else:
268 return ''
269
270+
271 def resetSequences(cur):
272 """Reset table sequences to match the data in them.
273
274@@ -278,9 +282,11 @@
275 if sql:
276 cur.execute(sql)
277
278+
279 # Regular expression used to parse row count estimate from EXPLAIN output
280 _rows_re = re.compile("rows=(\d+)\swidth=")
281
282+
283 def estimateRowCount(cur, query):
284 """Ask the PostgreSQL query optimizer for an estimated rowcount.
285
286@@ -307,7 +313,7 @@
287 return int(match.group(1))
288
289
290-def have_table(cur, table):
291+def have_table(cur, table, schema='public'):
292 """Is there a table of the given name?
293
294 Returns boolean answer.
295@@ -324,12 +330,14 @@
296 cur.execute('''
297 SELECT count(*) > 0
298 FROM pg_tables
299- WHERE tablename=%s
300- ''' % str(quote(table)))
301+ WHERE
302+ schemaname=%s
303+ AND tablename=%s
304+ ''' % sqlvalues(schema, table))
305 return (cur.fetchall()[0][0] != 0)
306
307
308-def table_has_column(cur, table, column):
309+def table_has_column(cur, table, column, schema='public'):
310 """Does a table of the given name exist and have the given column?
311
312 Returns boolean answer.
313@@ -349,13 +357,16 @@
314 SELECT count(*) > 0
315 FROM pg_attribute
316 JOIN pg_class ON pg_class.oid = attrelid
317- WHERE relname=%s
318+ JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
319+ WHERE
320+ nspname=%s
321+ AND relname=%s
322 AND attname=%s
323- ''' % sqlvalues(table, column))
324+ ''' % sqlvalues(schema, table, column))
325 return (cur.fetchall()[0][0] != 0)
326
327
328-def drop_tables(cur, tables):
329+def drop_tables(cur, tables, schema='public'):
330 """Drop given tables (a list, one name, or None), if they exist.
331
332 >>> cur.execute("CREATE TEMP TABLE foo (a integer)")
333@@ -381,8 +392,10 @@
334 if isinstance(tables, basestring):
335 tables = [tables]
336
337+ fqns = [fqn(schema, table) for table in tables]
338+
339 # This syntax requires postgres 8.2 or better
340- cur.execute("DROP TABLE IF EXISTS %s" % ','.join(tables))
341+ cur.execute("DROP TABLE IF EXISTS %s" % ', '.join(fqns))
342
343
344 def allow_sequential_scans(cur, permission):
345@@ -422,9 +435,10 @@
346
347 cur.execute("SET enable_seqscan=%s" % permission_value)
348
349+
350 def all_tables_in_schema(cur, schema):
351 """Return a set of all tables in the given schema.
352-
353+
354 :returns: A set of quoted, fully qualified table names.
355 """
356 cur.execute("""
357@@ -569,4 +583,3 @@
358
359 for table, column in listReferences(cur, 'person', 'id'):
360 print '%32s %32s' % (table, column)
361-