Launchpad itself

Merge lp:~stub/launchpad/bug-778338-transient-schema into lp:launchpad

bug-778338-transient-schema
Merge into devel

Proposed by Stuart Bishop on 2011-05-11

Status:

Work in progress

Proposed branch:

lp:~stub/launchpad/bug-778338-transient-schema

Merge into:

lp:launchpad

Prerequisite:

lp:~stub/launchpad/bug-179821-lowercase-tablenames

Diff against target:

361 lines (+78/-46)

3 files modified

lib/canonical/database/ftests/test_multitablecopy.txt (+26/-14)
lib/canonical/database/multitablecopy.py (+25/-18)
lib/canonical/database/postgresql.py (+27/-14)

To merge this branch:

bzr merge lp:~stub/launchpad/bug-778338-transient-schema

Critical

Fix Released

Link a bug report

Reviewer	Review Type	Date Requested	Status
Launchpad code reviewers		2011-05-11	Pending
Review via email: mp+60652@code.launchpad.net

Description of the change

= Summary =

Multitable Copy creates transient tables. These may be discovered by the replication tools and make them explode because they don't know what to do with them.

== Proposed fix ==

Put the transient tables in their own schema, which will automatically be ignored. This seems cleaner than just white listing 'temp_%' and relying on a well known prefix.

== Pre-implementation notes ==

== Implementation details ==

== Tests ==

== Demo and Q/A ==

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/canonical/database/postgresql.py
  lib/canonical/database/ftests/test_multitablecopy.txt
  lib/canonical/database/multitablecopy.py

./lib/canonical/database/ftests/test_multitablecopy.txt
       1: narrative uses a moin header.
      14: narrative uses a moin header.
      54: narrative uses a moin header.
      84: narrative uses a moin header.
     121: narrative uses a moin header.
     157: narrative uses a moin header.
     199: narrative exceeds 78 characters.
     244: narrative uses a moin header.
     274: narrative uses a moin header.
     276: narrative exceeds 78 characters.
     311: narrative exceeds 78 characters.
     439: narrative uses a moin header.
     486: narrative uses a moin header.
     498: narrative has trailing whitespace.
     590: narrative uses a moin header.
     658: narrative uses a moin header.

lp:~stub/launchpad/bug-778338-transient-schema updated on 2011-05-12

13016. By Stuart Bishop on 2011-05-12: Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.

Unmerged revisions

13016. By Stuart Bishop on 2011-05-12: Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.
13015. By Stuart Bishop on 2011-05-11: delint
13014. By Stuart Bishop on 2011-05-11: Merged bug-179821-lowercase-tablenames into bug-778338-transient-schema.
13013. By Stuart Bishop on 2011-05-11: multitablecopy needs to create transient tables in a different schema so they are ignored for replication

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Barki Mustapha

Celso Providelo

Christian Reis

Christy Awad

Colin Watson

Harpianto,ANDI

James Troup

John A Meinel

Kevin bush

Launchpad code reviewers

Launchpad code reviewers from Canonical

Matthew Tanner

Maximiliano Bertacchini

Oguz Ersoz

Simon Brakhane

Stuart Bishop

Ubuntu-BR DevOps

William Grant

alhawiti

api.ng

pedro cavazos

todaioan

wenjingwen

to status/vote changes:

Tzaddi

Tzaddi Belding

1	=== modified file 'lib/canonical/database/ftests/test_multitablecopy.txt'
2	--- lib/canonical/database/ftests/test_multitablecopy.txt 2011-05-12 07:06:44 +0000
3	+++ lib/canonical/database/ftests/test_multitablecopy.txt 2011-05-12 07:06:46 +0000
4	@@ -111,9 +111,11 @@
5
6	We have the data we're copying in holding tables now.
7
8	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
9	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
10	+ >>> postgresql.have_table(cur, name, schema)
11	True
12	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
13	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
14	+ >>> postgresql.have_table(cur, name, schema)
15	True
16
17
18	@@ -145,9 +147,11 @@
19
20	And the holding tables are gone.
21
22	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
23	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
24	+ >>> postgresql.have_table(cur, name, schema)
25	False
26	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
27	+ >>> schema, name = copier.getRawHoldingTableName('textual')
28	+ >>> postgresql.have_table(cur, name, schema)
29	False
30
31
32	@@ -280,15 +284,19 @@
33	>>> copier.needsRecovery()
34	False
35
36	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
37	+ >>> schema, name = copier.getRawHoldingTableName('textual')
38	+ >>> postgresql.have_table(cur, name, schema)
39	True
40	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
41	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
42	+ >>> postgresql.have_table(cur, name, schema)
43	True
44
45	>>> copier.dropHoldingTables()
46	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
47	+ >>> schema, name = copier.getRawHoldingTableName('textual')
48	+ >>> postgresql.have_table(cur, name, schema)
49	False
50	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
51	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
52	+ >>> postgresql.have_table(cur, name, schema)
53	False
54
55	>>> cur.execute("SELECT t, count(*) FROM textual GROUP BY t ORDER BY t")
56	@@ -331,9 +339,11 @@
57	>>> transaction.begin()
58	<transaction...
59	>>> cur = cursor()
60	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
61	+ >>> schema, name = copier.getRawHoldingTableName('textual')
62	+ >>> postgresql.have_table(cur, name, schema)
63	False
64	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
65	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
66	+ >>> postgresql.have_table(cur, name, schema)
67	True
68
69	Our textual data has been copied, so the textual table now lists each of its
70	@@ -370,9 +380,11 @@
71	This time we run to completion without problems.
72
73	>>> cur = cursor()
74	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('textual'))
75	+ >>> schema, name = copier.getRawHoldingTableName('textual')
76	+ >>> postgresql.have_table(cur, name, schema)
77	False
78	- >>> postgresql.have_table(cur, copier.getRawHoldingTableName('numeric'))
79	+ >>> schema, name = copier.getRawHoldingTableName('numeric')
80	+ >>> postgresql.have_table(cur, name, schema)
81	False
82
83	>>> cur.execute("""
84	@@ -638,10 +650,10 @@
85
86	>>> copier.pour(transaction)
87	Pouring textual
88	- Pouring text from "temp_textual_holding_test" to textual
89	+ Pouring text from "transient"."textual_holding_test" to textual
90	...
91	Pouring numeric
92	- Pouring numbers from "temp_numeric_holding_test" to numeric
93	+ Pouring numbers from "transient"."numeric_holding_test" to numeric
94	...
95
96
97
98	=== modified file 'lib/canonical/database/multitablecopy.py'
99	--- lib/canonical/database/multitablecopy.py 2011-05-12 07:06:44 +0000
100	+++ lib/canonical/database/multitablecopy.py 2011-05-12 07:06:46 +0000
101	@@ -247,30 +247,31 @@
102
103	def dropHoldingTables(self):
104	"""Drop any holding tables that may exist for this MultiTableCopy."""
105	- holding_tables = [self.getHoldingTableName(table)
106	- for table in self.tables]
107	- postgresql.drop_tables(cursor(), holding_tables)
108	+ for table in self.tables:
109	+ schema, raw_table = self.getRawHoldingTableName(table)
110	+ postgresql.drop_tables(cursor(), raw_table, schema)
111
112	def getRawHoldingTableName(self, tablename, suffix=''):
113	- """Name for a holding table, but without quotes. Use with care."""
114	+ """(schema, name) for a holding table, but without quotes."""
115	if suffix:
116	suffix = '_%s' % suffix
117
118	assert re.search(r'[^a-z_]', tablename + suffix) is None, (
119	'Unsupported characters in table name per Bug #179821')
120
121	- raw_name = "temp_%s_holding_%s%s" % (
122	+ raw_name = "%s_holding_%s%s" % (
123	str(tablename), self.name, str(suffix))
124
125	- return raw_name
126	+ # In a separate schema for transient tables per Bug #778338.
127	+ return ("transient", raw_name)
128
129	def getHoldingTableName(self, tablename, suffix=''):
130	"""Name for a holding table to hold data being copied in tablename.
131
132	Return value is properly quoted for use as an SQL identifier.
133	"""
134	- raw_name = self.getRawHoldingTableName(tablename, suffix)
135	- return quoteIdentifier(raw_name)
136	+ schema, tablename = self.getRawHoldingTableName(tablename, suffix)
137	+ return "%s.%s" % (quoteIdentifier(schema), quoteIdentifier(tablename))
138
139	def _pointsToTable(self, source_table, foreign_key):
140	"""Name of table that source_table.foreign_key refers to.
141	@@ -485,14 +486,16 @@
142	"""Index id column on holding table.
143
144	Creates a unique index on "id" column in holding table. The index
145	- gets the name of the holding table with "_id" appended.
146	+ gets the name of the holding table with "temp_" prepended and
147	+ "_id" appended.
148	"""
149	source_table = str(source_table)
150	self.logger.debug("Indexing %s" % holding_table)
151	+ index_name = quoteIdentifier('temp_%s_id' % holding_table.lower())
152	cur.execute('''
153	CREATE UNIQUE INDEX %s
154	ON %s (id)
155	- ''' % (self.getHoldingTableName(source_table, 'id'), holding_table))
156	+ ''' % (index_name, holding_table))
157
158	def _retargetForeignKeys(self, holding_table, joins, cur):
159	"""Replace foreign keys in new holding table.
160	@@ -522,15 +525,18 @@
161	# If there are any holding tables to be poured into their source
162	# tables, there must at least be one for the last table that pour()
163	# processes.
164	- last_holding_table = self.getRawHoldingTableName(self.tables[-1])
165	- if not postgresql.have_table(cur, last_holding_table):
166	+ schema, last_holding_table = self.getRawHoldingTableName(
167	+ self.tables[-1])
168	+ if not postgresql.have_table(cur, last_holding_table, schema):
169	return False
170
171	# If the first table in our list also still exists, and it still has
172	# its new_id column, then the pouring process had not begun yet.
173	# Assume the data was not ready for pouring.
174	- first_holding_table = self.getRawHoldingTableName(self.tables[0])
175	- if postgresql.table_has_column(cur, first_holding_table, 'new_id'):
176	+ schema, first_holding_table = self.getRawHoldingTableName(
177	+ self.tables[0])
178	+ if postgresql.table_has_column(
179	+ cur, first_holding_table, 'new_id', schema):
180	self.logger.info(
181	"Previous run aborted too early for recovery; redo all")
182	return False
183	@@ -568,9 +574,10 @@
184	# there's a matching holding table. If so, prepare it, pour it back
185	# into the source table, and drop.
186	for table in self.tables:
187	- holding_table_unquoted = self.getRawHoldingTableName(table)
188	+ schema, holding_table_unquoted = self.getRawHoldingTableName(
189	+ table)
190
191	- if not postgresql.have_table(cur, holding_table_unquoted):
192	+ if not postgresql.have_table(cur, holding_table_unquoted, schema):
193	# We know we're in a suitable state for pouring. If this
194	# table does not exist, it must be because it's been poured
195	# out completely and dropped in an earlier instance of this
196	@@ -584,14 +591,14 @@
197	tablestarttime = time.time()
198
199	has_new_id = postgresql.table_has_column(
200	- cur, holding_table_unquoted, 'new_id')
201	+ cur, holding_table_unquoted, 'new_id', schema)
202
203	self._pourTable(
204	holding_table, table, has_new_id, transaction_manager)
205
206	# Drop holding table. It may still contain rows with id set to
207	# null. Those must not be poured.
208	- postgresql.drop_tables(cursor(), holding_table)
209	+ postgresql.drop_tables(cursor(), holding_table_unquoted, schema)
210
211	self.logger.debug(
212	"Pouring %s took %.3f seconds."
213
214	=== modified file 'lib/canonical/database/postgresql.py'
215	--- lib/canonical/database/postgresql.py 2009-09-21 08:13:40 +0000
216	+++ lib/canonical/database/postgresql.py 2011-05-12 07:06:46 +0000
217	@@ -85,7 +85,7 @@
218
219	for t in cur.fetchall():
220	# t == (src_table, src_column, dest_table, dest_column, upd, del)
221	- if t not in _state: # Avoid loops
222	+ if t not in _state: # Avoid loops
223	_state.append(t)
224	# Recurse, Locating references to the reference we just found.
225	listReferences(cur, t[0], t[1], _state)
226	@@ -93,6 +93,7 @@
227	# from the original (table, column), making it easier to change keys
228	return _state
229
230	+
231	def listUniques(cur, table, column):
232	'''Return a list of unique indexes on `table` that include the `column`
233
234	@@ -175,6 +176,7 @@
235	rv.append(tuple(keys))
236	return rv
237
238	+
239	def listSequences(cur):
240	"""Return a list of (schema, sequence, table, column) tuples.
241
242	@@ -206,7 +208,7 @@
243	for schema, sequence in list(cur.fetchall()):
244	match = re.search('^(\w+)_(\w+)_seq$', sequence)
245	if match is None:
246	- rv.append( (schema, sequence, None, None) )
247	+ rv.append((schema, sequence, None, None))
248	else:
249	table = match.group(1)
250	column = match.group(2)
251	@@ -225,11 +227,12 @@
252	cur.execute(sql, dict(schema=schema, table=table, column=column))
253	num = cur.fetchone()[0]
254	if num == 1:
255	- rv.append( (schema, sequence, table, column) )
256	+ rv.append((schema, sequence, table, column))
257	else:
258	- rv.append( (schema, sequence, None, None) )
259	+ rv.append((schema, sequence, None, None))
260	return rv
261
262	+
263	def generateResetSequencesSQL(cur):
264	"""Return SQL that will reset table sequences to match the data in them.
265	"""
266	@@ -257,6 +260,7 @@
267	else:
268	return ''
269
270	+
271	def resetSequences(cur):
272	"""Reset table sequences to match the data in them.
273
274	@@ -278,9 +282,11 @@
275	if sql:
276	cur.execute(sql)
277
278	+
279	# Regular expression used to parse row count estimate from EXPLAIN output
280	_rows_re = re.compile("rows=(\d+)\swidth=")
281
282	+
283	def estimateRowCount(cur, query):
284	"""Ask the PostgreSQL query optimizer for an estimated rowcount.
285
286	@@ -307,7 +313,7 @@
287	return int(match.group(1))
288
289
290	-def have_table(cur, table):
291	+def have_table(cur, table, schema='public'):
292	"""Is there a table of the given name?
293
294	Returns boolean answer.
295	@@ -324,12 +330,14 @@
296	cur.execute('''
297	SELECT count(*) > 0
298	FROM pg_tables
299	- WHERE tablename=%s
300	- ''' % str(quote(table)))
301	+ WHERE
302	+ schemaname=%s
303	+ AND tablename=%s
304	+ ''' % sqlvalues(schema, table))
305	return (cur.fetchall()[0][0] != 0)
306
307
308	-def table_has_column(cur, table, column):
309	+def table_has_column(cur, table, column, schema='public'):
310	"""Does a table of the given name exist and have the given column?
311
312	Returns boolean answer.
313	@@ -349,13 +357,16 @@
314	SELECT count(*) > 0
315	FROM pg_attribute
316	JOIN pg_class ON pg_class.oid = attrelid
317	- WHERE relname=%s
318	+ JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
319	+ WHERE
320	+ nspname=%s
321	+ AND relname=%s
322	AND attname=%s
323	- ''' % sqlvalues(table, column))
324	+ ''' % sqlvalues(schema, table, column))
325	return (cur.fetchall()[0][0] != 0)
326
327
328	-def drop_tables(cur, tables):
329	+def drop_tables(cur, tables, schema='public'):
330	"""Drop given tables (a list, one name, or None), if they exist.
331
332	>>> cur.execute("CREATE TEMP TABLE foo (a integer)")
333	@@ -381,8 +392,10 @@
334	if isinstance(tables, basestring):
335	tables = [tables]
336
337	+ fqns = [fqn(schema, table) for table in tables]
338	+
339	# This syntax requires postgres 8.2 or better
340	- cur.execute("DROP TABLE IF EXISTS %s" % ','.join(tables))
341	+ cur.execute("DROP TABLE IF EXISTS %s" % ', '.join(fqns))
342
343
344	def allow_sequential_scans(cur, permission):
345	@@ -422,9 +435,10 @@
346
347	cur.execute("SET enable_seqscan=%s" % permission_value)
348
349	+
350	def all_tables_in_schema(cur, schema):
351	"""Return a set of all tables in the given schema.
352	-
353	+
354	:returns: A set of quoted, fully qualified table names.
355	"""
356	cur.execute("""
357	@@ -569,4 +583,3 @@
358
359	for table, column in listReferences(cur, 'person', 'id'):
360	print '%32s %32s' % (table, column)
361	-