pt-query-digest fails to set history columns for disk tmp tables and disk filesort

Bug #963225 reported by Rob Smith
18
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Toolkit moved to https://jira.percona.com/projects/PT
Fix Released
Medium
Brian Fraser
2.0
Fix Released
Medium
Daniel Nichter
2.1
Fix Released
Medium
Brian Fraser

Bug Description

Using current pt-query-digest with the history table defined as specified here:
http://www.percona.com/doc/percona-toolkit/2.0/pt-query-digest.html#cmdoption-pt-query-digest--review-history

Disk_tmp_table_cnt
Disk_tmp_table_sum
Disk_filesort_cnt
Disk_filesort_sum

And if I check the slow query logs, I see many lines similar to this:

# Query_time: 18.449333 Lock_time: 0.000067 Rows_sent: 218406 Rows_examined: 16450308 Rows_affected: 0 Rows_read: 213971
# Bytes_sent: 8136552 Tmp_tables: 1 Tmp_disk_tables: 1 Tmp_table_sizes: 76878912
# InnoDB_trx_id: 29DC5005A
# QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: Yes
# Filesort: Yes Filesort_on_disk: Yes Merge_passes: 10
SET timestamp=1332279707;

As you can see, this query is using the ondisk tmp tables and on disk filesorts, and yet the columns are null for the checksum

mysql> SELECT COUNT(0), Disk_tmp_table_cnt FROM query_review_history WHERE checksum = '16760467266904871542' GROUP BY Disk_tmp_table_cnt\G
*************************** 1. row ***************************
          COUNT(0): 34
Disk_tmp_table_cnt: NULL
1 row in set (0.00 sec)

mysql> SELECT COUNT(0), Tmp_table_cnt FROM query_review_history WHERE checksum = '16760467266904871542' GROUP BY Tmp_table_cnt\G
*************************** 1. row ***************************
     COUNT(0): 15
Tmp_table_cnt: NULL
*************************** 2. row ***************************
     COUNT(0): 2
Tmp_table_cnt: 1
and so on

This is very important information that is getting lost in the process

Related branches

Revision history for this message
Rob Smith (kormoc) wrote :

Looking at the code it looks like perhaps the slow query output was changed? I see the code expecting this:
# QC_Hit: %s Full_scan: %s Full_join: %s Tmp_table: %s Disk_tmp_table: %s
# Filesort: %s Disk_filesort: %s Merge_passes: %d
and I have this
# QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: Yes
# Filesort: Yes Filesort_on_disk: Yes Merge_passes: 10

Was this perhaps changed in a later version of percona server? I'm currently running 5.1.56-rel12.7, but that said, I think it is important to allow processing/reprocessing of old slow query logs. For example, when the innodb transaction field hex bug was fixes, I went ahead and reprocessed all my slow query logs and 'found' a few dozen new queries that were masked by the bug

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

That looks like the problem, yes. However I can say there is way too much complexity in this code already to deal with various stupidities such as this, and I'm going to make an executive decision and say we won't "fix" it in the tool. I will send some stern words of displeasure to whomever needlessly changed the log format, and the workaround in your case is to alter the table and change the column names to match the new log format. Apologies for the inconvenience.

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Actually, there is an option to alias an attribute as another. Another way to fix the problem is by just updating the default value of that option. See --attribute-aliases.

However, this option WILL go away in the future to simplify this code.

Revision history for this message
Rob Smith (kormoc) wrote :

Changing the tables and reprocessing the logs did the trick. Thanks much.

I would suggest that the table definition in the docs be updated to use the new names so others don't have to find out later that they are missing data like I did.

Thanks for your time!

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Rob, it looks like I jumped the gun. We checked Percona Server's source history and these attributes haven't changed their names. What servers are you using that have different attribute names?

Revision history for this message
Rob Smith (kormoc) wrote :

[SLAVE root@db001 ~]# rpm -qi Percona-Server-server-51-5.1.56-rel12.7.224.rhel5
Name : Percona-Server-server-51 Relocations: (not relocatable)
Version : 5.1.56 Vendor: Percona, Inc
Release : rel12.7.224.rhel5 Build Date: Fri 15 Apr 2011 02:15:42 PM PDT
Install Date: Wed 27 Apr 2011 04:52:24 PM PDT Build Host: sm1u02.gni.percona.com
Group : Applications/Databases Source RPM: Percona-Server-51-5.1.56-rel12.7.224.rhel5.src.rpm
Size : 45149140 License: GPL version 2 http://www.gnu.org/licenses/gpl-2.0.html
Signature : DSA/SHA1, Mon 18 Apr 2011 11:28:10 AM PDT, Key ID 1c4cbdcdcd2efd2a
Packager : Percona, Inc Development Team <email address hidden>
URL : http://www.percona.com/
Summary : Percona Server (GPL), 12.7, Revision 224 for Red Hat Enterprise Linux 5

[SLAVE root@db001 log]# tail -f slow_queries.log
# User@Host: User[User] @ Host [ip]
# Thread_id: 22729487 Schema: db Last_errno: 0 Killed: 0
# Query_time: 5.082390 Lock_time: 0.014276 Rows_sent: 2 Rows_examined: 15797857 Rows_affected: 0 Rows_read: 7718504
# Bytes_sent: 140 Tmp_tables: 1 Tmp_disk_tables: 0 Tmp_table_sizes: 1044720
# InnoDB_trx_id: 2A2544B6B
# QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: No
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0
use db;
SET timestamp=1332597874;

and that timestamp maps to Sat, 24 Mar 2012 14:04:34 GMT, so it's not an old log by any means.

All of our databases are on the same point release and same rpm and show the same log format.

I downloaded the src.rpm from http://www.percona.com/downloads/Percona-Server-5.1/Percona-Server-5.1.56-12.7/source/ and grepped for the strings I'm seeing and they appear to be set.

kormoc@kormoc: ~/Downloads/test/Percona-Server-5.1.56 > grep -rin 'Tmp_table_on_disk' *
sql/log.cc:2428: "# QC_Hit: %s Full_scan: %s Full_join: %s Tmp_table: %s Tmp_table_on_disk: %s\n" \
sql/mysql_priv.h:735: { SLOG_F_TMP_DISK, "tmp_table_on_disk" },
sql/mysqld.cc:7076: "Log only the queries that followed certain execution plan. Multiple flags allowed in a comma-separated string. [qc_miss, full_scan, full_join, tmp_table, tmp_table_on_disk, filesort, filesort_on_disk]",

and there are no matches for the string pt-query-digest looks for

kormoc@kormoc: ~/Downloads/test/Percona-Server-5.1.56 > grep -rin 'Disk_tmp_table' *
kormoc@kormoc: ~/Downloads/test/Percona-Server-5.1.56 >

Revision history for this message
Rob Smith (kormoc) wrote :

I just verified that the latest 5.5 also appears to use Tmp_table_on_disk and not Disk_tmp_table. Is the pt-query-digest table definition just wrong?

Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

You are right, the documentation for Percona Toolkit, which controls the schema of the table that's created, is probably the problem here. Facepalm!

Changed in percona-toolkit:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Baron Schwartz (baron-xaprb) wrote :

Looks like the original problem is a buggy SlowLogWriter.pm that we then used to generate a bunch of wrong sample files, which are now well-established as test cases.

Revision history for this message
Brian Fraser (fraserbn) wrote :

( for reference, this actually came up before as https://bugs.launchpad.net/percona-toolkit/+bug/897927 )

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-502

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.