oops using openvswitch gre tunnels with upstream commit 703133de in kernel

Bug #1262692 reported by Chris J Arges
This bug affects 3 people
Affects               Status        Importance  Assigned to     Milestone
openvswitch (Ubuntu)  Fix Released  Undecided   Unassigned
  Precise             Fix Released  Medium      Chris J Arges
  Quantal             Fix Released  Medium      Chris J Arges

Bug Description

1) This affects Precise w/ 3.5 HWE kernel and Quantal series.
This is using the openvswitch 1.4.6 package currently in proposed.

This also affects openvswitch on Precise and Quantal with any of the Ubuntu kernels, since they all contain upstream commit 703133de:
e81a8144 -> ubuntu-precise
77dc64ef -> ubuntu-quantal
..

2)
ii openvswitch-common 1.4.6-0ubuntu1.12.10.1 amd64 Open vSwitch common components
ii openvswitch-datapath-dkms 1.4.6-0ubuntu1.12.10.1 all Open vSwitch datapath module source - DKMS version
ii openvswitch-switch 1.4.6-0ubuntu1.12.10.1 amd64 Open vSwitch switch implementations
ii openvswitch-test 1.4.6-0ubuntu1.12.10.1 all Open vSwitch test package

3)
Expected that setting up a GRE tunnel won't cause an oops.

Test Case:
# Install above packages with affected versions.
# Ensure you have an isolated network setup for eth1 between vms
# Obviously replace IPs with what's appropriate on your network
ovs-vsctl del-br integbr
ovs-vsctl add-br integbr
ifconfig eth1 192.168.234.21 netmask 255.255.255.0
ovs-vsctl add-port integbr gre0 -- set interface gre0 type=gre options:remote_ip=192.168.234.20
ovs-vsctl add-port integbr tsp0 -- set interface tsp0 type=internal
ifconfig tsp0 192.168.15.2 netmask 255.255.255.0
iptables -F
iptables -F -t nat
# goes boom here

4)
This is the bt from crash:

PID: 1197 TASK: ffff88007a009700 CPU: 0 COMMAND: "ovs-vswitchd"
 #0 [ffff8800799213f0] machine_kexec at ffffffff8103a275
 #1 [ffff880079921460] crash_kexec at ffffffff810b9d68
 #2 [ffff880079921530] oops_end at ffffffff81685600
 #3 [ffff880079921560] no_context at ffffffff81676637
 #4 [ffff8800799215c0] __bad_area_nosemaphore at ffffffff81676822
 #5 [ffff880079921610] bad_area_nosemaphore at ffffffff81676854
 #6 [ffff880079921620] do_page_fault at ffffffff816881bb
 #7 [ffff880079921730] do_async_page_fault at ffffffff81687a35
 #8 [ffff880079921750] async_page_fault at ffffffff81684a95
 #9 [ffff8800799218b0] ovs_vport_send at ffffffffa01b956e [openvswitch]
#10 [ffff8800799218d0] do_output at ffffffffa01b0214 [openvswitch]
#11 [ffff8800799218e0] do_execute_actions at ffffffffa01b031f [openvswitch]
#12 [ffff880079921970] ovs_execute_actions at ffffffffa01b0a68 [openvswitch]
#13 [ffff8800799219b0] ovs_packet_cmd_execute at ffffffffa01b1279 [openvswitch]
#14 [ffff880079921a10] genl_rcv_msg at ffffffff8159c9b0
#15 [ffff880079921aa0] netlink_rcv_skb at ffffffff8159c2d1
#16 [ffff880079921ad0] genl_rcv at ffffffff8159c745
#17 [ffff880079921af0] netlink_unicast at ffffffff8159bc2d
#18 [ffff880079921b40] netlink_sendmsg at ffffffff8159bf8a
#19 [ffff880079921bd0] sock_sendmsg at ffffffff8155c838
#20 [ffff880079921d50] ___sys_sendmsg at ffffffff8155cc81
#21 [ffff880079921f00] __sys_sendmsg at ffffffff8155ea09
#22 [ffff880079921f70] sys_sendmsg at ffffffff8155ea62
#23 [ffff880079921f80] system_call_fastpath at ffffffff8168c5a9
    RIP: 00007f4b2bfee570 RSP: 00007fff3cfd08e8 RFLAGS: 00000206
    RAX: 000000000000002e RBX: ffffffff8168c5a9 RCX: ffffffffffffffff
    RDX: 0000000000000000 RSI: 00007fff3cfcd910 RDI: 0000000000000025
    RBP: 00007f4b2f2a9e40 R8: 00007f4b2f2ab2c0 R9: 00000000000002ff
    R10: 1600000000000000 R11: 0000000000000246 R12: ffffffff8155ea62
    R13: ffff880079921f78 R14: 00007fff3cfcd910 R15: 0000000000000002
    ORIG_RAX: 000000000000002e CS: 0033 SS: 002b

Chris J Arges (arges)
Changed in openvswitch (Ubuntu):
assignee: nobody → Chris J Arges (arges)
status: New → In Progress
Chris J Arges (arges) wrote :

If I revert 77dc64ef531cfadd7d1d148f9a754e7a7c87c1f2 (703133de331a7a7df47f31fb9de51dc6f68a9de8 upstream) from the 3.5 series Ubuntu kernel the problem disappears. Will need to investigate further if this is something that should be solved in the ovs module or kernel side.

tags: added: bot-stop-nagging
Changed in linux (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
status: New → In Progress
Chris J Arges (arges) wrote :

I need to find if the problem is with linux or openvswitch.

Chris J Arges (arges) wrote :

OK, so the problem is that Linux changed the prototype of ip_select_ident. We can fix this on the ovs side using the following:

diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 1ebbc77..d62e1ea 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -1294,8 +1294,14 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
                iph->tos = tos;
                iph->ttl = ttl;
                iph->frag_off = frag_off;
-               ip_select_ident(iph, &rt_dst(rt), NULL);

+               /* linux commit 703133de changes the interface of
+                  ip_select_ident from iph to skb */
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,2,0)
+               ip_select_ident(skb, &rt_dst(rt), NULL);
+#else
+               ip_select_ident(iph, &rt_dst(rt), NULL);
+#endif
                skb = tnl_vport->tnl_ops->update_header(vport, mutable,
                                                        &rt_dst(rt), skb);
                if (unlikely(!skb))

So we just need to check whether the kernel version has the new ip_select_ident prototype and switch accordingly. The version check above isn't exactly correct, since upstream Linux and the stable kernels differ in where this change appears.
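
To illustrate how the version gate in the patch evaluates, here is a minimal userspace sketch. It assumes the classic KERNEL_VERSION encoding from <linux/version.h>; the LINUX_VERSION_CODE value and the helpers ident_takes_skb() and report_gate() are hypothetical stand-ins and not part of the patch. A version comparison like this only approximates whether the backported commit is actually present in a given kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Classic encoding from <linux/version.h>: one byte per component. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption: a real module build gets this from the kernel headers.
 * Here we pretend to build against a 3.5.0 kernel. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(3, 5, 0)
#endif

/* Hypothetical helper: returns 1 when the gate in the patch would pick
 * the skb-based call, 0 when it would keep the iph-based call. */
static int ident_takes_skb(void)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 2, 0)
	return 1;
#else
	return 0;
#endif
}

/* Prints which branch the preprocessor selected for this build. */
static void report_gate(void)
{
	/* Sanity check on the encoding: 3.2.0 packs to 0x030200. */
	assert(KERNEL_VERSION(3, 2, 0) == 0x030200);
	printf("version code 0x%06x -> ip_select_ident(%s, ...)\n",
	       (unsigned int)LINUX_VERSION_CODE,
	       ident_takes_skb() ? "skb" : "iph");
}
```

Calling report_gate() from a small main() shows the selected branch. Building with an older value, for example -DLINUX_VERSION_CODE=0x030100, should flip it to the iph-based call.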

Chris J Arges (arges) wrote :

This commit is present in the latest 3.2/3.5 kernels for Ubuntu, therefore I'll ifdef it for 3,2,0.

Chris J Arges (arges)
no longer affects: linux (Ubuntu)
Changed in openvswitch (Ubuntu Precise):
assignee: nobody → Chris J Arges (arges)
Changed in openvswitch (Ubuntu Quantal):
assignee: nobody → Chris J Arges (arges)
Changed in openvswitch (Ubuntu Precise):
importance: Undecided → Medium
Changed in openvswitch (Ubuntu Quantal):
importance: Undecided → High
importance: High → Medium
Changed in openvswitch (Ubuntu Precise):
status: New → In Progress
Changed in openvswitch (Ubuntu Quantal):
status: New → In Progress
Changed in openvswitch (Ubuntu):
assignee: Chris J Arges (arges) → nobody
importance: Medium → Undecided
status: In Progress → Fix Released
Changed in openvswitch (Ubuntu Raring):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
status: New → In Progress
Chris J Arges (arges)
summary: - oops using openvswitch 1.4.6 with 3.5 series kernel
+ oops using openvswitch gre tunnels with upstream commit 703133de in
+ kernel
Changed in openvswitch (Ubuntu Saucy):
status: New → Fix Released
Chris J Arges (arges) wrote :
description: updated
no longer affects: openvswitch (Ubuntu Raring)
no longer affects: openvswitch (Ubuntu Saucy)
description: updated
Chris J Arges (arges) wrote :

Fixes pushed for Precise and Quantal.

Brian Murray (brian-murray) wrote : Please test proposed package

Hello Chris, or anyone else affected,

Accepted openvswitch into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/openvswitch/1.4.6-0ubuntu1.12.10.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in openvswitch (Ubuntu Quantal):
status: In Progress → Fix Committed
tags: added: verification-needed
Changed in openvswitch (Ubuntu Precise):
status: In Progress → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Chris, or anyone else affected,

Accepted openvswitch into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/openvswitch/1.4.6-0ubuntu1.12.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Ante Karamatić (ivoks) wrote :

I can confirm that the proposed solution fixes the problem.

tags: added: verification-done
removed: verification-needed
Giovanni Meo (giovannimeo) wrote :

Patch works for me too. Thanks for it.

Chris J Arges (arges)
tags: added: verification-done-precise verification-done-quantal
removed: verification-done
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openvswitch - 1.4.6-0ubuntu1.12.04.2

---------------
openvswitch (1.4.6-0ubuntu1.12.04.2) precise; urgency=low

  * d/p/0010-datapath_ip_select_ident_fix.patch: linux commit 703133de changes
    the first parameter of ip_select_ident from iph to skb (LP: #1262692)
 -- Chris J Arges <email address hidden> Wed, 08 Jan 2014 13:36:14 -0600

Changed in openvswitch (Ubuntu Precise):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for openvswitch has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openvswitch - 1.4.6-0ubuntu1.12.10.2

---------------
openvswitch (1.4.6-0ubuntu1.12.10.2) quantal; urgency=low

  * d/p/0010-datapath_ip_select_ident_fix.patch: linux commit 703133de changes
    the first parameter of ip_select_ident from iph to skb (LP: #1262692)
 -- Chris J Arges <email address hidden> Wed, 08 Jan 2014 13:54:40 -0600

Changed in openvswitch (Ubuntu Quantal):
status: Fix Committed → Fix Released
Hua Zhang (zhhuabj) wrote :

Chris, can this patch work on Ubuntu 12.04 with Linux 3.2.0-23-generic now? This is a similar issue: https://bugs.launchpad.net/ubuntu-advantage/+bug/1341626
