ext4 random block I/O write performance regression with 3.11 Saucy Kernel

Bug #1242812 reported by Colin Ian King
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Colin Ian King

Bug Description

Commit e7ea81db5 (ext4: restructure writeback path) introduced a performance regression with random writes. A 10-20% slowdown can be observed with tools such as bonnie++, with writes using dd, or with stress testing using tools such as 'stress'.

SRU Justification:

Impact:

Write performance is diminished, causing a noticeable regression compared to previously released kernels.

Fix:

Two patches are required:

a) upstream fix 9c12a83, which fixes the overly aggressive writeback of pages that ultimately resulted in more seeking and lower performance.

b) commit aeac589a7 from the dev branch of kernel/git/tytso/ext4.git which ensures no more pages than nr_to_write can be added to the extent for mapping.
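
The idea behind patch (b) can be sketched with a toy model in Python. This is not the kernel code: the function name gather_extent and the list-of-page-indices representation are hypothetical, chosen only to show what "no more pages than nr_to_write" means when building an extent from contiguous dirty pages.

```python
def gather_extent(dirty_pages, nr_to_write):
    """Toy model: collect a run of contiguous dirty page indices.

    The budget check happens BEFORE a page is added, so the extent can
    never hold more than nr_to_write pages. (Hypothetical helper, not
    the ext4 implementation.)
    """
    extent = []
    for page in dirty_pages:
        # Stop once the writeback budget is used up.
        if len(extent) >= nr_to_write:
            break
        # Stop at the first gap; an extent covers contiguous pages only.
        if extent and page != extent[-1] + 1:
            break
        extent.append(page)
    return extent


# With a budget of 2, only two of the four contiguous pages are taken.
print(gather_extent([3, 4, 5, 6, 9], nr_to_write=2))
```

Checking the budget after appending instead of before would let one extra page slip into the extent, which is the kind of off-by-one this sketch is meant to make visible.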

Testcase:

Using stress-ng on a 2 CPU machine run:

stress-ng --hdd 2 --hdd-ops 200000

(see: git://kernel.ubuntu.com/cking/stress-ng.git)

With the fix, this consistently runs ~10-20% faster than the non-fixed kernel.
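
For repeatable comparisons between kernels, the test case can be wrapped in a small timing harness. The following is a minimal sketch, assuming Python 3 is available on the test machine; the helper name time_command is an illustration and not part of stress-ng.

```python
import subprocess
import time


def time_command(cmd, iterations=10):
    """Run cmd repeatedly; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(iterations):
        start = time.monotonic()
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.monotonic() - start)
    return durations
```

Calling time_command(["stress-ng", "--hdd", "2", "--hdd-ops", "200000"]) on each kernel and comparing the mean of the returned durations gives the kind of before/after figure this test case relies on.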

Changed in linux (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Colin King (colin-king)
Joseph Salisbury (jsalisbury) wrote :

bug 1242085 may be a duplicate of this bug.

tags: added: kernel-da-key saucy
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-saucy' to 'verification-done-saucy'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-saucy
Colin Ian King (colin-king) wrote :

I installed the -proposed kernel and ran 10 iterations of a block write soak test using stress-ng, measuring the time taken to perform 200000 I/O operations:

 3.11.0-13-generic #20-Ubuntu : 127.6 seconds

 3.11.0-14-generic #21-Ubuntu (-proposed): 78.5 seconds

so the -proposed kernel completes the same workload in about 38% less time, a substantial improvement in block I/O write performance, as expected.
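
The two durations above can be turned into a speedup factor and an approximate throughput. This is a small Python sketch of that arithmetic; it assumes the 200000 operations are the total for the whole run, and the function name compare is hypothetical.

```python
OPS = 200_000  # --hdd-ops value from the test case (assumed to be the run total)


def compare(baseline_s, fixed_s, ops=OPS):
    """Derive speedup, time saved and ops/s from two run durations in seconds."""
    return {
        "speedup": baseline_s / fixed_s,
        "time_saved_pct": 100.0 * (baseline_s - fixed_s) / baseline_s,
        "baseline_ops_per_s": ops / baseline_s,
        "fixed_ops_per_s": ops / fixed_s,
    }


# Durations reported above: 3.11.0-13 (unfixed) vs 3.11.0-14 (-proposed).
result = compare(127.6, 78.5)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

For these figures the speedup works out to roughly 1.63x, or about 38% less elapsed time.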

I have also thrashed ext4 with multiple kernel builds for a few hours and ext4 performs without any observable faults. I deem this verified and passed.

tags: added: verification-done-saucy
removed: verification-needed-saucy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.11.0-14.21

---------------
linux (3.11.0-14.21) saucy; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1250540

  [ Anthony Wong ]

  * SAUCE: Work around broken ACPI backlight on Dell Inspiron 5537
    - LP: #1231305

  [ Colin Ian King ]

  * SAUCE: eCryptfs: fix 32 bit corruption issue
    - LP: #1243636

  [ Ming Lei ]

  * SAUCE: ext4: fix performance regression in ext4_writepages
    - LP: #1242812

  [ Upstream Kernel Changes ]

  * Revert "bridge: only expire the mdb entry when query is received"
    - LP: #1249081
  * ext4: fix performance regression in writeback of random writes
    - LP: #1242812
  * be2net: pass if_id for v1 and V2 versions of TX_CREATE cmd
    - LP: #1234019
  * tcp: TSO packets automatic sizing
    - LP: #1249081
  * tcp: TSQ can use a dynamic limit
    - LP: #1249081
  * tcp: must unclone packets before mangling them
    - LP: #1249081
  * tcp: do not forget FIN in tcp_shifted_skb()
    - LP: #1249081
  * tcp: fix incorrect ca_state in tail loss probe
    - LP: #1249081
  * net: do not call sock_put() on TIMEWAIT sockets
    - LP: #1249081
  * batman-adv: set up network coding packet handlers during module init
    - LP: #1249081
  * l2tp: fix kernel panic when using IPv4-mapped IPv6 addresses
    - LP: #1249081
  * l2tp: Fix build warning with ipv6 disabled.
    - LP: #1249081
  * net: mv643xx_eth: update statistics timer from timer context only
    - LP: #1249081
  * net: mv643xx_eth: fix orphaned statistics timer crash
    - LP: #1249081
  * net: heap overflow in __audit_sockaddr()
    - LP: #1249081
  * sit: amend "allow to use rtnl ops on fb tunnel"
    - LP: #1249081
  * proc connector: fix info leaks
    - LP: #1249081
  * ipv4: fix ineffective source address selection
    - LP: #1249081
  * can: dev: fix nlmsg size calculation in can_get_size()
    - LP: #1249081
  * net: secure_seq: Fix warning when CONFIG_IPV6 and CONFIG_INET are not
    selected
    - LP: #1249081
  * xen-netback: Don't destroy the netdev until the vif is shut down
    - LP: #1249081
  * net/mlx4_en: Rename name of mlx4_en_rx_alloc members
    - LP: #1249081
  * net/mlx4_en: Fix pages never dma unmapped on rx
    - LP: #1249081
  * net: vlan: fix nlmsg size calculation in vlan_get_size()
    - LP: #1249081
  * bridge: update mdb expiration timer upon reports.
    - LP: #1249081
  * vti: get rid of nf mark rule in prerouting
    - LP: #1249081
  * l2tp: must disable bh before calling l2tp_xmit_skb()
    - LP: #1249081
  * netem: update backlog after drop
    - LP: #1249081
  * netem: free skb's in tree on reset
    - LP: #1249081
  * farsync: fix info leak in ioctl
    - LP: #1249081
  * unix_diag: fix info leak
    - LP: #1249081
  * connector: use nlmsg_len() to check message length
    - LP: #1249081
  * bnx2x: record rx queue for LRO packets
    - LP: #1249081
  * virtio-net: don't respond to cpu hotplug notifier if we're not ready
    - LP: #1249081
  * virtio-net: refill only when device is up during setting queues
    - LP: #1249081
  * bridge: Correctly clamp MAX forward_delay when enabling STP
    - LP: #1249081
  * net: dst: provide accessor function to dst->xfrm
 ...


Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Daniel Lezcano (daniel-lezcano) wrote :

Is this fix also present in 'trusty'?

Colin Ian King (colin-king) wrote :

@Daniel, these commits are present in Trusty.
