Merge lp:~laurynas-biveinis/percona-server/i-s-temp-tables-fixes-5.1 into lp:percona-server/5.1

Proposed by Laurynas Biveinis
Status: Work in progress
Proposed branch: lp:~laurynas-biveinis/percona-server/i-s-temp-tables-fixes-5.1
Merge into: lp:percona-server/5.1
Diff against target: 151 lines (+50/-23)
2 files modified
Percona-Server/sql/sql_show.cc (+46/-23)
Percona-Server/storage/innodb_plugin/handler/ha_innodb.cc (+4/-0)
To merge this branch: bzr merge lp:~laurynas-biveinis/percona-server/i-s-temp-tables-fixes-5.1
Reviewer Review Type Date Requested Status
Alexey Kopytov (community) Needs Fixing
Laurynas Biveinis (community) Needs Fixing
Review via email: mp+185493@code.launchpad.net

Description of the change

No BT or ST but QA blocker.

Fix three INFORMATION_SCHEMA.TEMPORARY_TABLES and
INFORMATION_SCHEMA.GLOBAL_TEMPORARY_TABLES bugs:

- bug 1222709 (Bug 951588 fix needs revert and re-fixing);
- bug 1113388 (field.cc:3822: virtual longlong Field_long::val_int():
  Assertion `table->in_use == _current_thd()' failed);
- bug 1223335 (fill_temporary_tables() does not protect
  thd->temporary_tables access with LOCK_temporary_tables).

The bug 1222709 and its duplicats stem from the bug 951588 fix calling
handler::clone() on a temp table handler in an attempt to make
handler::info() call safe for concurrency as it would be called on a
non-shared handler then. The problem with this approach is that
handler::clone() itself is not safe for concurrent calls. Thus remove
its call, and fix the original bug 951588 issue by only calling
handler::info() when it is safe to do so, that is, when the current
I_S-query-executing thread is the same as the temp table owning one.
For the rest, do not call handler::info() and just read the existing
statistic values.

At the same time, if zeroes are encountered in the stat file fields,
assume that handler::info() has never been called on the temp table
handler and return NULL instead of 0.

At the same time revert the ha_innodb::clone() changes that were
required to support the previous bug 951588 fix use.

Fix bug 1113388 by removing the temporary THD::in_use patching in
fill_global_temporary_tables(), which now server no purpose and only
introduces a race condition as the field might be read by its owning
thread.

Fix bug 1223335 by locking thd->LOCK_temporary_tables around
thd->temporary_tables iteration in fill_temporary_tables().

http://jenkins.percona.com/job/percona-server-5.1-param/572/

To post a comment you must log in.
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

"// Assume that invoking handler::table_type() on a shared handler is safe" comment should be removed as well from store_temporary_table_record(). Too little an issue for a repush, thus will either fix if review requires repush otherwise, either before merging.

review: Needs Fixing
Revision history for this message
Alexey Kopytov (akopytov) wrote :

  - bug #1223335 is invalid, so references and the “fix” should be
    removed
  - the revision comments provide almost no information on the problem
    being addressed. So “The problem with this approach is that
    handler::clone() itself is not safe for concurrent calls.” Why?
  - even if there are reasons to not use handler::clone(), I don’t like
    the fix. The fix introduces a rather important change in
    functionality: I_S.GLOBAL_TEMPORARY_TABLES will not show table
    sizes. Which basically makes it useless. So I’d like to be sure
    there are no alternative solutions. But first I need to understand
    the problem (see above).
  - “At the same time, if zeroes are encountered in the stat file fields,
    assume that handler::info() has never been called on the temp table
    handler and return NULL instead of 0.” — that won’t work. Couple of
    issues:
      - in [GLOBAL_]TEMPORARY_TABLES there are only 2 nullable columns:
        CREATE_TIME and UPDATE_TIME. So assuming that the value will be
        NULL if set_not_null() is not called is wrong. And the call to
        set_not_null() is pointless.
      - even if that worked, assuming zero values to indicate that
        info() was not called is wrong. They will be 0 for an empty
        table, for example.
  - spurious change at line 84

review: Needs Fixing
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Alexey -

> being addressed. So “The problem with this approach is that
> handler::clone() itself is not safe for concurrent calls.” Why?
> - even if there are reasons to not use handler::clone(), I don’t like
> the fix. The fix introduces a rather important change in
> functionality: I_S.GLOBAL_TEMPORARY_TABLES will not show table
> sizes. Which basically makes it useless. So I’d like to be sure
> there are no alternative solutions. But first I need to understand
> the problem (see above).

My (probably incorrect) guess on ::clone() being unsafe for concurrent calls was based on the sheer number of different crashing bugs found with the RQG testing, listed in the bug 1222709 description, and the fact that ha_innobase::clone() in the upstream had code that needed to be removed in order for it to work when called from another thread. Previous attempts to fix those crashing bugs resulted in elaborate patching throughout ::clone() methods of many storage engines, and even in the handler core - see https://code.launchpad.net/~laurynas-biveinis/percona-server/i-s-temp-tables-fixes-1-5.1/+merge/178092. Even with those fixes I couldn't figure out how to create fully safe conditions for clone(), that's why I thought that clone() was never meant to be used like that.

But now, in addition to your review, I also just found a recent push [1]. Which suggests that a safe clone() from another thread should be possible, and that I should be able to find how it's supposed to be used in PFS sources. I will look into this.

[1]

5.6$ bzr log -r 5356
------------------------------------------------------------
revno: 5356
committer: Annamalai Gurusami <email address hidden>
branch nick: mysql-5.6
timestamp: Wed 2013-08-14 14:57:39 +0530
message:
  Bug #17001980 RQG_PERFORMANCE_SCHEMA STOPPED WITH A CRASH IN
  HA_INNOBASE::CLONE.

  Problem:

  In the function ha_innobase::clone(), it was asserted that a thread
  cannot clone a table handler that is used by another thread. It was
  also asserted that the original table handler and the cloned table
  handler must belong to the same transaction. Both are wrong.

  Solution:

  It is OK for a thread to clone a table handler that is used by another
  thread. The original table handler and the cloned table handler can
  belong to different transactions. So both asserts are removed.

  rb#3066 approved by Marko

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

> fix those crashing bugs resulted in elaborate patching throughout ::clone()
> methods of many storage engines, and even in the handler core

Errr, that's not fully correct, while the crashes fixed were in many storage engines, the patching in that MP was more localized.

Unmerged revisions

578. By Laurynas Biveinis

Fix three INFORMATION_SCHEMA.TEMPORARY_TABLES and
INFORMATION_SCHEMA.GLOBAL_TEMPORARY_TABLES bugs:

- bug 1222709 (Bug 951588 fix needs revert and re-fixing);
- bug 1113388 (field.cc:3822: virtual longlong Field_long::val_int():
  Assertion `table->in_use == _current_thd()' failed);
- bug 1223335 (fill_temporary_tables() does not protect
  thd->temporary_tables access with LOCK_temporary_tables).

The bug 1222709 and its duplicats stem from the bug 951588 fix calling
handler::clone() on a temp table handler in an attempt to make
handler::info() call safe for concurrency as it would be called on a
non-shared handler then. The problem with this approach is that
handler::clone() itself is not safe for concurrent calls. Thus remove
its call, and fix the original bug 951588 issue by only calling
handler::info() when it is safe to do so, that is, when the current
I_S-query-executing thread is the same as the temp table owning one.
For the rest, do not call handler::info() and just read the existing
statistic values.

At the same time, if zeroes are encountered in the stat file fields,
assume that handler::info() has never been called on the temp table
handler and return NULL instead of 0.

At the same time revert the ha_innodb::clone() changes that were
required to support the previous bug 951588 fix use.

Fix bug 1113388 by removing the temporary THD::in_use patching in
fill_global_temporary_tables(), which now server no purpose and only
introduces a race condition as the field might be read by its owning
thread.

Fix bug 1223335 by locking thd->LOCK_temporary_tables around
thd->temporary_tables iteration in fill_temporary_tables().

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'Percona-Server/sql/sql_show.cc'
2--- Percona-Server/sql/sql_show.cc 2013-05-31 12:43:01 +0000
3+++ Percona-Server/sql/sql_show.cc 2013-09-13 13:36:27 +0000
4@@ -3642,13 +3642,20 @@
5 @param[in] table I_S table
6 @param[in] tmp_table temporary table
7 @param[in] db database name
8+ @param[in] table_name_only whether to return only session_id,
9+ database, and table fields
10+ @param[in] thd_is_current_thd whether the temp table belongs to
11+ the current connection
12
13 @return Operation status
14 @retval 0 success
15 @retval 1 error
16 */
17
18-static int store_temporary_table_record(THD *thd, TABLE *table, TABLE *tmp_table, const char *db, bool table_name_only)
19+static int store_temporary_table_record(THD *thd, TABLE *table,
20+ TABLE *tmp_table, const char *db,
21+ bool table_name_only,
22+ bool thd_is_current_thd)
23 {
24 CHARSET_INFO *cs= system_charset_info;
25 DBUG_ENTER("store_temporary_table_record");
26@@ -3688,11 +3695,6 @@
27
28 MYSQL_TIME time;
29
30- /* We have only one handler object for a temp table globally and it might
31- be in use by other thread. Do not trash it by invoking handler methods on
32- it but rather clone it. */
33- file = file->clone(tmp_table->s->normalized_path.str, thd->mem_root);
34-
35 /**
36 TODO: InnoDB stat(file) checks file on short names within data dictionary
37 rather than using full path, because of that, temp files created in
38@@ -3700,14 +3702,33 @@
39
40 The fix is to patch InnoDB to use full path
41 */
42- file->info(HA_STATUS_VARIABLE | HA_STATUS_TIME | HA_STATUS_NO_LOCK);
43-
44- table->field[5]->store((longlong) file->stats.records, TRUE);
45- table->field[5]->set_notnull();
46-
47- table->field[6]->store((longlong) file->stats.mean_rec_length, TRUE);
48- table->field[7]->store((longlong) file->stats.data_file_length, TRUE);
49- table->field[8]->store((longlong) file->stats.index_file_length, TRUE);
50+ if (thd_is_current_thd)
51+ file->info(HA_STATUS_VARIABLE | HA_STATUS_TIME | HA_STATUS_NO_LOCK);
52+
53+ if (file->stats.records)
54+ {
55+ table->field[5]->store((longlong) file->stats.records, TRUE);
56+ table->field[5]->set_notnull();
57+ }
58+
59+ if (file->stats.mean_rec_length)
60+ {
61+ table->field[6]->store((longlong) file->stats.mean_rec_length, TRUE);
62+ table->field[6]->set_notnull();
63+ }
64+
65+ if (file->stats.data_file_length)
66+ {
67+ table->field[7]->store((longlong) file->stats.data_file_length, TRUE);
68+ table->field[7]->set_notnull();
69+ }
70+
71+ if (file->stats.index_file_length)
72+ {
73+ table->field[8]->store((longlong) file->stats.index_file_length, TRUE);
74+ table->field[8]->set_notnull();
75+ }
76+
77 if (file->stats.create_time)
78 {
79 thd->variables.time_zone->gmt_sec_to_TIME(&time,
80@@ -3715,6 +3736,7 @@
81 table->field[9]->store_time(&time, MYSQL_TIMESTAMP_DATETIME);
82 table->field[9]->set_notnull();
83 }
84+
85 if (file->stats.update_time)
86 {
87 thd->variables.time_zone->gmt_sec_to_TIME(&time,
88@@ -3722,8 +3744,6 @@
89 table->field[10]->store_time(&time, MYSQL_TIMESTAMP_DATETIME);
90 table->field[10]->set_notnull();
91 }
92-
93- file->close();
94 }
95
96 DBUG_RETURN(schema_table_store_record(thd, table));
97@@ -3773,17 +3793,14 @@
98 }
99 #endif
100
101- THD *t= tmp->in_use;
102- tmp->in_use= thd;
103-
104- if (store_temporary_table_record(thd_item, tables->table, tmp, thd->lex->select_lex.db, table_names_only)) {
105- tmp->in_use= t;
106+ if (store_temporary_table_record(thd_item, tables->table, tmp,
107+ thd->lex->select_lex.db,
108+ table_names_only, thd_item == thd)) {
109 pthread_mutex_unlock(&thd_item->LOCK_temporary_tables);
110 pthread_mutex_unlock(&LOCK_thread_count);
111 DBUG_RETURN(1);
112 }
113
114- tmp->in_use= t;
115 }
116 pthread_mutex_unlock(&thd_item->LOCK_temporary_tables);
117 }
118@@ -3814,11 +3831,17 @@
119 bool table_names_only= (thd->lex->sql_command == SQLCOM_SHOW_TEMPORARY_TABLES) ? 1 : 0;
120 TABLE *tmp;
121
122+ pthread_mutex_lock(&thd->LOCK_temporary_tables);
123+
124 for (tmp=thd->temporary_tables; tmp; tmp=tmp->next) {
125- if (store_temporary_table_record(thd, tables->table, tmp, thd->lex->select_lex.db, table_names_only)) {
126+ if (store_temporary_table_record(thd, tables->table, tmp,
127+ thd->lex->select_lex.db,
128+ table_names_only, true)) {
129+ pthread_mutex_unlock(&thd->LOCK_temporary_tables);
130 DBUG_RETURN(1);
131 }
132 }
133+ pthread_mutex_unlock(&thd->LOCK_temporary_tables);
134 DBUG_RETURN(0);
135 }
136
137
138=== modified file 'Percona-Server/storage/innodb_plugin/handler/ha_innodb.cc'
139--- Percona-Server/storage/innodb_plugin/handler/ha_innodb.cc 2013-08-23 07:34:57 +0000
140+++ Percona-Server/storage/innodb_plugin/handler/ha_innodb.cc 2013-09-13 13:36:27 +0000
141@@ -4537,6 +4537,10 @@
142 new_handler = static_cast<ha_innobase*>(handler::clone(name,
143 mem_root));
144 if (new_handler) {
145+ DBUG_ASSERT(new_handler->prebuilt != NULL);
146+ DBUG_ASSERT(new_handler->user_thd == user_thd);
147+ DBUG_ASSERT(new_handler->prebuilt->trx == prebuilt->trx);
148+
149 new_handler->prebuilt->select_lock_type
150 = prebuilt->select_lock_type;
151 }

Subscribers

People subscribed via source and target branches