server won't start when flush_method=ALL_O_DIRECT and track_changed_pages=TRUE

Bug #1131949 reported by George Ormond Lorch III
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
High
George Ormond Lorch III
5.1
Fix Released
High
Laurynas Biveinis
5.5
Fix Released
High
Laurynas Biveinis
5.6
Fix Released
High
George Ormond Lorch III

Bug Description

Server fails to start with:
InnoDB: Error: tried to read 65536 bytes at offset 0 2048.
InnoDB: Was only able to read 0.

my.cnf:
[client]
port=10031
socket=/data/dev/allo/mysql.sock
[mysqld]
basedir=/data/bin/ps-5.5-16274-mk3
datadir=/data/dev/allo/data
tmpdir=/data/dev/allo/tmp
port=10031
socket=/data/dev/allo/mysql.sock
pid-file=/data/dev/allo/mysql.pid
console
user=root
server-id=1
innodb_track_changed_pages=TRUE
innodb_flush_method=ALL_O_DIRECT

Using branch:
  parent branch: bzr+ssh://bazaar.launchpad.net/~laurynas-biveinis/percona-server/QA-16274-5.5-3/

Related branches

Revision history for this message
George Ormond Lorch III (gl-az) wrote :
tags: added: bitmap xtradb
tags: added: qa
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

George -

I cannot reproduce this locally. Could you please
1) attach the full error log for the failed run;
2) set a breakpoint at these InnoDB diagnostics you get and provide a stacktrace leading either to the diagnostics or the failing call itself?

Revision history for this message
George Ormond Lorch III (gl-az) wrote : Re: [Bug 1131949] Re: server won't start when flush_method=ALL_O_DIRECT and track_changed_pages=TRUE

Sure Laurynas, in fact, I'll even go a little further and see if I can
debug/fix it for you, should be pretty straightforward.

On 5/27/2013 10:50 AM, Laurynas Biveinis wrote:
> George -
>
> I cannot reproduce this locally. Could you please
> 1) attach the full error log for the failed run;
> 2) set a breakpoint at these InnoDB diagnostics you get and provide a stacktrace leading either to the diagnostics or the failing call itself?
>
> ** Changed in: percona-server/5.1
> Status: In Progress => Triaged
>
> ** Changed in: percona-server/5.5
> Assignee: Laurynas Biveinis (laurynas-biveinis) => George Ormond Lorch III (gl-az)
>
> ** Changed in: percona-server/5.1
> Assignee: Laurynas Biveinis (laurynas-biveinis) => George Ormond Lorch III (gl-az)
>
> ** Changed in: percona-server/5.6
> Assignee: Laurynas Biveinis (laurynas-biveinis) => George Ormond Lorch III (gl-az)
>

--
George O. Lorch III
Software Engineer, Percona
+1-888-401-3401 x542 US/Arizona (GMT -7)
skype: george.ormond.lorch.iii

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Looks like it is unaligned i/o on log file

Backtrace:
(gdb) bt
#0 _os_file_read (file=12, buf=0x7fffa8966010, offset=2048, offset_high=0, n=65536, trx=0x0) at os/os0file.c:2525
#1 0x0000000000865fc0 in os_aio (type=10, mode=24, name=0x3a948b8 "./ib_logfile0", file=12, buf=0x7fffa8966010, offset=2048, offset_high=0, n=65536, message1=0x3a947b8, message2=0x0, space_id=4294967280, trx=0x0) at os/os0file.c:3823
#2 0x0000000000935d58 in _fil_io (type=10, sync=1, space_id=4294967280, zip_size=0, block_offset=0, byte_offset=2048, len=65536, buf=0x7fffa8966010, message=0x0, trx=0x0) at fil/fil0fil.c:5341
#3 0x0000000000853f7b in log_group_read_log_seg (type=98887331, buf=0x7fffa8966010 "", group=0x3a94948, start_lsn=8192, end_lsn=73728) at log/log0log.c:2326
#4 0x0000000000856504 in log_online_follow_log_seg (group=0x3a94948, block_start_lsn=8192, block_end_lsn=73728) at log/log0online.c:973
#5 0x0000000000856697 in log_online_follow_log_group (group=0x3a94948, contiguous_lsn=8192) at log/log0online.c:1042
#6 0x0000000000856ba9 in log_online_follow_redo_log () at log/log0online.c:1212
#7 0x00000000008b514d in srv_redo_log_follow_thread (arg=0x0) at srv/srv0srv.c:2751
#8 0x000000391b007851 in start_thread () from /lib64/libpthread.so.0
#9 0x000000391a8e811d in clone () from /lib64/libc.so.6

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

5.6 does not have the option to set O_DIRECT for log files. Will merge the fix of aligning the read buffer at OS_FILE_LOG_BLOCK_SIZE anyway, since that's what InnoDB does for the log write buffer.

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :
Download full text (3.8 KiB)

Same happens with Percona Server 5.7.10-3 GA as soon as I have:

innodb_flush_method=ALL_O_DIRECT

in my.cnf:

[root@centos ~]# service mysql stop
Stopping mysqld: [ OK ]
[root@centos ~]# vi /etc/my.cnf
[root@centos ~]# service mysql start
Percona MySQL Daemon failed to start.
Starting mysqld: [FAILED]
[root@centos ~]# tail -100 /var/log/mysqld.log
...
2016-02-26T15:49:07.327649Z 0 [Note] /usr/sbin/mysqld: Shutdown complete

160226 17:49:07 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
160226 17:50:41 mysqld_safe Transparent huge pages are already set to: never.
160226 17:50:41 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
2016-02-26T15:50:42.607807Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2016-02-26T15:50:42.608882Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' sql modes should be used with strict mode. They will be merged with strict mode in a future release.
2016-02-26T15:50:42.609278Z 0 [Warning] 'NO_AUTO_CREATE_USER' sql mode was not set.
2016-02-26T15:50:42.622279Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.10-3-log) starting as process 2958 ...
2016-02-26T15:50:42.667392Z 0 [Warning] No argument was provided to --log-bin, and --log-bin-index was not used; so replication may break when this MySQL server acts as a master and has his hostname changed!! Please use '--log-bin=centos-bin' to avoid this problem.
2016-02-26T15:50:42.680642Z 0 [Note] InnoDB: innodb_empty_free_list_algorithm has been changed to legacy because of small buffer pool size. In order to use backoff, increase buffer pool at least up to 20MB.

2016-02-26T15:50:42.681648Z 0 [Note] InnoDB: PUNCH HOLE support available
2016-02-26T15:50:42.681862Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-02-26T15:50:42.681948Z 0 [Note] InnoDB: Uses event mutexes
2016-02-26T15:50:42.682280Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2016-02-26T15:50:42.682376Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
2016-02-26T15:50:42.682439Z 0 [Note] InnoDB: Using Linux native AIO
2016-02-26T15:50:42.684322Z 0 [Note] InnoDB: Number of pools: 1
2016-02-26T15:50:42.685172Z 0 [Note] InnoDB: Not using CPU crc32 instructions
2016-02-26T15:50:43.008759Z 0 [Note] InnoDB: Initializing buffer pool, total size = 5M, instances = 1, chunk size = 5M
2016-02-26T15:50:43.029818Z 0 [Note] InnoDB: Completed initialization of buffer pool
2016-02-26T15:50:43.085462Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2016-02-26T15:50:43.645092Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2016-02-26T15:50:43.651811Z 0 [Warning] InnoDB: Retry attempts for reading partial data failed.
2016-02-26T15:50:43.652176Z 0 [ERROR] InnoDB: Tried to read 2048 bytes at offset 0 was only able to read0
2016-02-26T15:50:43.652257Z 0 [ERROR] InnoDB: Operating system error number 22 in a file o...

Read more...

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

This has the same root cause as bug 1529885, and the fix will be tracked there.

tags: added: i64171
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-638

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.