Stub resolver cache is corrupted

Bug #1818527 reported by Abam
This bug affects 10 people
Affects | Status | Importance | Assigned to | Milestone
systemd (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Invalid | Undecided | Unassigned |
Bionic | Fix Released | Medium | Heitor Alves de Siqueira |

Bug Description

[Impact]
systemd-resolved fails to resolve A records

[Description]
When systemd-resolved caches a non-existent CNAME record for a specific domain, further attempts at resolving A records for that same domain fail. This has been fixed upstream in v240.

Upstream commit: https://github.com/systemd/systemd/commit/3740146a4cbd

$ git describe --contains 3740146a4cbd
v240~839

$ rmadison systemd --arch amd64
 systemd | 229-4ubuntu4 | xenial | source, ...
 systemd | 229-4ubuntu21.21 | xenial-security | source, ...
 systemd | 229-4ubuntu21.21 | xenial-updates | source, ...
 systemd | 237-3ubuntu10 | bionic | source, ...
 systemd | 237-3ubuntu10.19 | bionic-security | source, ...
 systemd | 237-3ubuntu10.21 | bionic-updates | source, ...
 systemd | 237-3ubuntu10.22 | bionic-proposed | source, ...
 systemd | 239-7ubuntu10 | cosmic | source, ...
 systemd | 239-7ubuntu10.12 | cosmic-security | source, ...
 systemd | 239-7ubuntu10.13 | cosmic-updates | source, ...
 systemd | 239-7ubuntu10.14 | cosmic-proposed | source, ...
 systemd | 240-6ubuntu5 | disco | source, ...
 systemd | 240-6ubuntu5.1 | disco-proposed | source, ...
 systemd | 240-6ubuntu9 | eoan | source, ...

Despite the package versions above, only Bionic is affected. Cosmic already includes a backported fix, and Xenial doesn't seem affected due to resolvconf handling DNS resolution.

[Test Case]
Flush resolved's caches and try resolving a non-existent CNAME record. Further resolution attempts for the corresponding A record will fail:

#1
On a Bionic host:
$ systemd-resolve --flush-caches
$ dig github.com CNAME
....
;; QUESTION SECTION:
;github.com. IN CNAME

;; Query time: 47 msec
.....

$ dig github.com A
....
;; QUESTION SECTION:
;github.com. IN A

;; Query time: 0 msec
....

By contrast, if no query for the non-existent CNAME record has been made first, the A record resolves correctly:

$ systemd-resolve --flush-caches
$ dig github.com
....
;; QUESTION SECTION:
;github.com. IN A

;; ANSWER SECTION:
github.com. 59 IN A 192.30.253.112

;; Query time: 51 msec
....

#2
On a Bionic host:
$ systemd-resolve --flush-caches
$ dig github.com CNAME
$ dig github.com A

Launch an LXD container with Cosmic/Disco/Eoan (systemd-240):
$ lxc launch ubuntu:cosmic cosmiclxd
$ lxc exec cosmiclxd bash
$ dig github.com A
....
;; QUESTION SECTION:
;github.com. IN A

;; Query time: 0 msec
....

Although Cosmic and later have the proper systemd fix, a Cosmic/Disco/Eoan container can still hit the bug if the host is Bionic, because the container uses the host as its DNS resolver.

So you may face the problem inside a Cosmic/Disco/Eoan container, but it's still the same Bionic systemd bug.
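
The two scenarios above can be sketched as a small script. This is only an illustration, not part of the SRU: the helper names classify_a_answer and reproduce are hypothetical, and the script assumes dig and systemd-resolve are installed and that flushing the cache is permitted for the calling user.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not shipped by any package.

# Read `dig +noall +answer NAME A` output on stdin and print
# "resolved" if it contains at least one A record, otherwise "empty".
classify_a_answer() {
    if grep -Eq '[[:space:]]IN[[:space:]]+A[[:space:]]+[0-9]'; then
        echo "resolved"
    else
        echo "empty"
    fi
}

# Reproduce the test case against the local stub resolver.
# Not invoked automatically; call it by hand, e.g. `reproduce github.com`.
reproduce() {
    name="${1:-github.com}"
    systemd-resolve --flush-caches
    # Query the non-existent CNAME first so the NODATA answer gets cached.
    dig +noall +answer "$name" CNAME >/dev/null
    # Then ask for the A record and classify what comes back.
    dig +noall +answer "$name" A | classify_a_answer
}
```

Running `reproduce github.com` on a host that behaves as described above would be expected to print "empty" before the fix and "resolved" after it.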

[Regression Potential]
The regression potential for this fix should be very low, as it's a direct cherry-pick from upstream systemd. It has seen extensive testing in both upstream and other Ubuntu releases, and was verified for Bionic through autopkgtests.

================================

[Original Description]

It seems that when systemd-resolved caches a non-existent CNAME record for a domain, any attempt to resolve the A record for the same domain fails.

systemd version the issue has been seen with
Installed: 237-3ubuntu10.13
Used distribution

Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

Expected behaviour you didn't see

Return A record for a domain when it exists.

Unexpected behaviour you saw

Resolution failed.

Steps to reproduce the problem

Wait for 1 minute (the TTL of the github.com A record).

Try to resolve the github.com CNAME record: dig CNAME github.com

This will return an empty result.

Then try to resolve the github.com A record: dig A github.com

This will now return an empty result unless you restart systemd-resolved or wait for the cache to expire.

At the same time, querying another DNS server resolves correctly: dig A github.com @8.8.8.8

Example:

Wait for 1 minute to let the cache expire, then run:

dig CNAME github.com
dig A github.com
# no result
dig A github.com @8.8.8.8
# ;; ANSWER SECTION:
# github.com. 59 IN A 192.30.253.113
# github.com. 59 IN A 192.30.253.112
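
The comparison between the local stub resolver and an external server can also be expressed as a small check. The following is a sketch under assumptions: count_a_records and diagnose are hypothetical helper names, and both functions only inspect text in the format printed by `dig +noall +answer`, so they do not perform any DNS query themselves.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Count the A records in `dig +noall +answer` output read from stdin.
count_a_records() {
    # grep -c exits non-zero when the count is 0; keep the "0" it prints.
    grep -Ec '[[:space:]]IN[[:space:]]+A[[:space:]]+[0-9]' || true
}

# Compare the answer from the local stub resolver ($1) with the answer
# from another DNS server ($2) and report whether they disagree in the
# way described in this report.
diagnose() {
    local_count=$(printf '%s\n' "$1" | count_a_records)
    upstream_count=$(printf '%s\n' "$2" | count_a_records)
    if [ "$local_count" -eq 0 ] && [ "$upstream_count" -gt 0 ]; then
        echo "stub resolver returned nothing, upstream has $upstream_count A record(s)"
    else
        echo "no discrepancy"
    fi
}
```

A possible invocation, assuming dig is installed, would be `diagnose "$(dig +noall +answer github.com A)" "$(dig +noall +answer github.com A @8.8.8.8)"` right after the CNAME query.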

PS: Don't forget to restart systemd-resolved before trying to post an answer.

This bug was first reported on GitHub at https://github.com/systemd/systemd/issues/11789, but the systemd version in Ubuntu is too old.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status: New → Confirmed
Chiang Fong Lee (myself-a) wrote :

This seems to be the same issue previously reported at https://github.com/systemd/systemd/issues/9833, which was fixed in https://github.com/systemd/systemd/pull/9836 (commit 3740146) and released in v240.

Guillaume Belanger (g.belanger) wrote :

This bug also affects MaaS when commissioning is done with Bionic 18.04 and an external proxy is set.

Commissioning fails at the <Hardware Test> step inside the <smartctl-validate> script when trying to install packages for the test itself.

Error message:
Temporary failure resolving <External proxy>

Changed in systemd (Ubuntu Bionic):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: Confirmed → Fix Released
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in systemd (Ubuntu Xenial):
assignee: nobody → Heitor Alves de Siqueira (halves)
Changed in systemd (Ubuntu Bionic):
importance: Undecided → Medium
Changed in systemd (Ubuntu Xenial):
importance: Undecided → Medium
Heitor Alves de Siqueira (halves) wrote :
description: updated
Changed in systemd (Ubuntu Xenial):
status: New → Invalid
importance: Medium → Undecided
assignee: Heitor Alves de Siqueira (halves) → nobody
Changed in systemd (Ubuntu Bionic):
status: Confirmed → In Progress
tags: added: sts sts-sponsor
Eric Desrochers (slashd) wrote :

[sts-sponsor]

There is an SRU in progress for systemd already for Bionic. It will have to wait for LP: #1814373 and #1825997 to be 'Fix Released' before sponsoring that particular bug.

Thanks Heitor for your contribution.

Let's circle back later.

- Eric

Dan Streetman (ddstreet) wrote :

I can include this in my next systemd upload, once the current -proposed versions are released.

tags: added: sts-sponsor-ddstreet
tags: added: ddstreet-next
Dan Streetman (ddstreet)
tags: added: systemd
Eric Desrochers (slashd)
description: updated
description: updated
description: updated
Dan Streetman (ddstreet)
tags: removed: ddstreet-next sts-sponsor-ddstreet
Eric Desrochers (slashd) wrote :

Sponsored in Bionic.

* The fix LGTM.
- Looking at the upstream systemd git repo, I couldn't find any follow-up to this particular fix (a revert, a known regression it introduced, or the like).
- I easily reproduced the bug using systemd without the fix, and I confirm it works for both enumerated scenarios found in [Test Case] just fine[0] with the fix.
- This has been extensively tested in both upstream and other Debian/Ubuntu releases.

* Very minor modifications:
- Slightly modified the DEP3 header (Adding the upstream bug link, ....)
- Renamed the patch from "lp1818527-resolved-do-not-hit-CNAME-in-NODATA.patch" to "resolved-do-not-hit-CNAME-in-NODATA.patch" to stay consistent with the naming of the other existing "resolved" patches.

Thanks Heitor for your contribution.

- Eric

[0] - Validation of the fix in Bionic:
# systemd-resolve --flush-caches
# dig github.com -t CNAME

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> github.com -t CNAME
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8781
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;github.com. IN CNAME

;; Query time: 22 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Jun 11 15:43:51 EDT 2019
;; MSG SIZE rcvd: 39

# dig github.com

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> github.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2811
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;github.com. IN A

;; ANSWER SECTION:
github.com. 42 IN A 192.30.253.113

;; Query time: 13 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Jun 11 15:43:55 EDT 2019
;; MSG SIZE rcvd: 55

Eric Desrochers (slashd)
description: updated
description: updated
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Abam, or anyone else affected,

Accepted systemd into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.23 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will help us get this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Heitor Alves de Siqueira (halves) wrote :

Verified for Bionic:

ubuntu@bionic:~$ dpkg -l | grep systemd
ii systemd 237-3ubuntu10.23 amd64 system and service manager

ubuntu@bionic:~$ systemd-resolve --flush-caches

ubuntu@bionic:~$ dig +noall +answer github.com CNAME

ubuntu@bionic:~$ dig +noall +answer github.com A
github.com. 18 IN A 140.82.118.4

tags: added: verification-done-bionic
removed: verification-needed-bionic
Heitor Alves de Siqueira (halves) wrote :

Some comments on the Autopkgtest regressions:

- lxc (armhf) fails due to the ftpmaster.internal mirror failing

- libvirt (armhf) fails for the same reason ("Unable to connect to ftpmaster.internal:http")

- systemd (ppc64el) has been failing since before this patch was introduced, and looking at the test logs it doesn't seem to be related to the stub resolver

- network-manager (arm64) has been failing since previous systemd versions (since systemd/237-3ubuntu10.21)

- gvfs (arm64) fails due to a permission error that should be unrelated to the stub resolver ("GLib.Error('Not authorized to perform ope[83 chars]', 0) != True")

- gvfs (i386) has been failing since before this patch was introduced

After going through the autopkgtest logs for the above, it seems that the failures are either due to autopkgtest infra, or have been introduced by something other than the systemd upload.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10.23

---------------
systemd (237-3ubuntu10.23) bionic; urgency=medium

  * d/p/resolved-do-not-hit-CNAME-in-NODATA.patch:
    - fix stub resolver cache (LP: #1818527)

 -- Heitor Alves de Siqueira <email address hidden> Tue, 04 Jun 2019 15:54:24 -0300

Changed in systemd (Ubuntu Bionic):
status: Fix Committed → Fix Released
Steve Langasek (vorlon) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
