broken start-up dependencies for ntp (starts before NIS is available)

Bug #999725 reported by Paul Crawford
This bug affects 3 people
Affects: ntp (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

We recently installed 12.04 LTS (32-bit) with NIS authentication and found two bugs with ntp. The first is that ntp was not installed, even though the clock manager allowed, and defaulted to, "internet time". The second, more serious bug, reported here, is that ntp is started before the system is capable of resolving DNS, so only time servers given by IP address are found.
Those given by name (ntp0.dundee.ac.uk and similar) are not found and, as ntp never performs a DNS lookup after initially starting, they remain unavailable until you manually issue the command "/etc/init.d/ntp restart".
This was not a problem with 10.04 LTS, and I also thought upstart was supposed to handle such start-up sequence dependencies?

$ lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

$ apt-cache policy ntp
ntp:
  Installed: 1:4.2.6.p3+dfsg-1ubuntu3
  Candidate: 1:4.2.6.p3+dfsg-1ubuntu3
  Version table:
 *** 1:4.2.6.p3+dfsg-1ubuntu3 0
        500 http://gb.archive.ubuntu.com/ubuntu/ precise/main i386 Packages
        100 /var/lib/dpkg/status

We expected ntp to be started after the network interfaces were up and DNS look-up possible.
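
To see which time servers are exposed to this problem, one can list the ntp.conf entries that are given by name rather than by IP address. The following is a minimal sketch, not taken from the ntp package: the function name and the sample configuration are hypothetical, and it assumes simple "server"/"peer" lines with the target as the second word.

```python
import ipaddress


def hostnames_needing_lookup(ntp_conf_text):
    """Return server/peer targets that are host names, not IP literals."""
    names = []
    for raw in ntp_conf_text.splitlines():
        # Drop trailing comments, then split the directive into words.
        words = raw.split("#", 1)[0].split()
        if len(words) < 2 or words[0] not in ("server", "peer"):
            continue
        target = words[1]
        try:
            ipaddress.ip_address(target)  # parses only for IP literals
        except ValueError:
            names.append(target)  # a name: needs a working resolver at start-up
    return names


# Hypothetical ntp.conf fragment using the name and address seen in this report.
sample_ntp_conf = """\
server ntp0.dundee.ac.uk iburst
server 172.30.254.253
# server commented-out.example
"""

print(hostnames_needing_lookup(sample_ntp_conf))
```

Only the names returned here depend on name resolution being available when ntpd starts; entries given as IP addresses are unaffected.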

Revision history for this message
Scott Moser (smoser) wrote :

Hi,
  Could you tell us how networking is configured in your system? That is, are you using network manager?

  ntp (or any rc-sysvinit script) should not start until all "static networking" is up. "static networking" means anything defined in /etc/network/interfaces, but does not include network manager. I would have thought that dns resolution should be ready at that point.

The problem description does seem to indicate that we should restart ntpd on network manager network attach, though.

Changed in ntp (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Paul Crawford (psc-sat) wrote :

I am not 100% sure of how the network was configured (the guy who did it is away today) but can report that the contents of /etc/network/interfaces are:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
 address 134.36.22.69
 netmask 255.255.255.0
 gateway 134.36.22.1

So it should be a simple static IP address configured there. While it should not matter, it is maybe worth mentioning that the PC has two Ethernet ports, though we are only using one of them for this.

Is there a log somewhere showing the start-up order actually used when it boots up?

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Thu, May 17, 2012 at 16:10:39 -0000, Paul Crawford wrote:
> # The primary network interface
> auto eth0
> iface eth0 inet static
> address 134.36.22.69
> netmask 255.255.255.0
> gateway 134.36.22.1

Since the resolvconf package is installed by default in Precise, you'd
normally need to have a "dns-nameservers" line in your "interfaces"
stanza in order for DNS resolution to work at all (given that you are
using a static configuration).

So, what does /etc/resolv.conf contain now? Also, what does
"ls -l /etc/resolv.conf" show?
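
The point about static stanzas needing a "dns-nameservers" line can be checked mechanically. This is a rough sketch only: the function name is hypothetical, the parser handles just the simple stanza layout quoted above, and the 192.0.2.53 address in the second example is a documentation placeholder, not a real server at this site.

```python
def static_stanzas_without_dns(interfaces_text):
    """Return names of 'inet static' interfaces with no dns-nameservers line."""
    missing = []
    current = None   # name of the static stanza being scanned, if any
    has_dns = False
    for raw in interfaces_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        words = stripped.split()
        starts_stanza = (words[0] in ("iface", "auto", "mapping", "source")
                         or words[0].startswith("allow-"))
        if starts_stanza:
            # A new stanza begins, so close the one we were scanning.
            if current is not None and not has_dns:
                missing.append(current)
            current, has_dns = None, False
            if words[0] == "iface" and words[2:4] == ["inet", "static"]:
                current = words[1]
        elif words[0] == "dns-nameservers" and current is not None:
            has_dns = True
    if current is not None and not has_dns:
        missing.append(current)
    return missing


# The interfaces file quoted earlier in this report.
reported = """\
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
 address 134.36.22.69
 netmask 255.255.255.0
 gateway 134.36.22.1
"""

print(static_stanzas_without_dns(reported))  # ['eth0']
# With a (placeholder) nameserver added, nothing is reported.
print(static_stanzas_without_dns(reported + " dns-nameservers 192.0.2.53\n"))  # []
```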

      Nathan

Revision history for this message
Paul Crawford (psc-sat) wrote : Re: broken start-up dependencies for ntp

Results for 12.04 machine are:
$ ls -l /etc/resolv.conf
lrwxrwxrwx 1 root root 29 Apr 30 17:39 /etc/resolv.conf -> ../run/resolvconf/resolv.conf

$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN

On another 10.04 machine I get a file, and its contents have our DNS servers listed.

But if resolv.conf is missing this, and we don't have any dns-nameservers in /etc/network/interfaces, then how is the machine getting DNS later when everything seems normal?

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Thu, May 17, 2012 at 16:46:15 -0000, Paul Crawford wrote:
> Results for 12.04 machine are:
> $ ls -l /etc/resolv.conf
> lrwxrwxrwx 1 root root 29 Apr 30 17:39 /etc/resolv.conf -> ../run/resolvconf/resolv.conf
>
> $ cat /etc/resolv.conf
> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN

Yes, this confirms that resolvconf is installed and active, but not
getting any DNS configuration information.

>
> On another 10.04 machine I get a file, and its contents have our DNS
> servers listed.

Yes, the resolvconf package wasn't (generally) used in Lucid, so you
probably have a "static" resolv.conf file to go along with a static
network interface definition. (You can check by seeing if "ls -l
/etc/resolv.conf" shows a normal file, and has a modification date from
a while ago.)
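
For comparing the two machines, a small sketch along these lines reports whether resolv.conf is a symlink (as with resolvconf) or a regular file, and which nameservers it lists. The helper names are hypothetical, and only "nameserver" lines are examined.

```python
import os


def nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines of a resolv.conf text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        words = raw.split()
        if len(words) >= 2 and words[0] == "nameserver":
            servers.append(words[1])
    return servers


def describe(path="/etc/resolv.conf"):
    """Return (kind, nameservers) for the given resolv.conf path."""
    kind = "symlink" if os.path.islink(path) else "regular file"
    with open(path) as fh:
        return kind, nameservers(fh.read())


# The resolvconf-generated content shown above lists no servers at all.
generated = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
"""
print(nameservers(generated))  # []
```

Calling describe() on the Precise machine would be expected to report a symlink with an empty list, and on the Lucid machine a regular file with the site's DNS servers, if the guess above is right.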

>
> But if resolv.conf is missing this, and we don't have any dns-
> nameservers in /etc/network/interfaces, then how is the machine getting
> DNS later when everything seems normal?

Yes, that's definitely a key question...

What happens if you try "ping ntp0.dundee.ac.uk" from that box? (It
doesn't matter if the ping itself actually succeeds, but the question is
whether it can resolve the name to an IP number.) How about "host
ntp0.dundee.ac.uk"?

      Nathan

Revision history for this message
Paul Crawford (psc-sat) wrote : Re: broken start-up dependencies for ntp

It is somewhat odd, as I get this:

$ ping ntp0.dundee.ac.uk
PING ntp0.dundee.ac.uk (172.30.254.253) 56(84) bytes of data.
64 bytes from 443-gb-core-6513.private.dundee.ac.uk (172.30.254.253): icmp_req=1 ttl=254 time=0.281 ms
64 bytes from 443-gb-core-6513.private.dundee.ac.uk (172.30.254.253): icmp_req=2 ttl=254 time=0.303 ms
64 bytes from 443-gb-core-6513.private.dundee.ac.uk (172.30.254.253): icmp_req=3 ttl=254 time=0.332 ms
64 bytes from 443-gb-core-6513.private.dundee.ac.uk (172.30.254.253): icmp_req=4 ttl=254 time=0.376 ms
^C
--- ntp0.dundee.ac.uk ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 0.281/0.323/0.376/0.035 ms

$ host ntp0.dundee.ac.uk

;; connection timed out; no servers could be reached

$ nslookup ntp0.dundee.ac.uk
;; connection timed out; no servers could be reached

So ping is able to perform the name-to-IP conversion fine, but host and nslookup both fail!
Other 'normal' programs seem to perform address lookup OK (e.g. entering "www.google.com" in firefox, or even ntp if restarted later) so there is something bizarre about the network management.

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Thu, May 17, 2012 at 18:43:59 -0000, Paul Crawford wrote:
> So ping is able to perform the name-to-IP conversion fine, but host
> and nslookup both fail!

Right, host and nslookup both (attempt to) do DNS queries directly,
while ping does the lookup using libc6 library routines...
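
To reproduce the ping-style lookup without ping, a short sketch like the one below can be used. It relies on Python's socket.getaddrinfo(), which on Linux goes through the libc resolver and therefore, as far as I know, follows the "hosts" line in /etc/nsswitch.conf rather than querying DNS servers directly. The wrapper name is hypothetical.

```python
import socket


def libc_lookup(name):
    """Resolve a name via getaddrinfo(), the libc path that ping also uses."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []  # no configured source (files, NIS, DNS, ...) knew the name
    # Each entry is (family, type, proto, canonname, sockaddr); keep the address.
    return sorted({info[4][0] for info in infos})


# An IP literal needs no lookup at all, so this works even with no resolver.
print(libc_lookup("127.0.0.1"))  # ['127.0.0.1']
```

On the affected machine, libc_lookup("ntp0.dundee.ac.uk") should succeed wherever ping does, even while host and nslookup time out.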

So, what do you get from:

  $ ls -l /etc/nsswitch.conf
  $ cat /etc/nsswitch.conf

(Also, does /etc/hosts contain anything besides the default lines?)

> Other 'normal' programs seem to perform address lookup OK (e.g.
> entering "www.google.com" in firefox, or even ntp if restarted later)
> so there is something bizarre about the network management.

You mentioned earlier that you had NIS installed on this machine, so I'm
guessing the behavior you are seeing is related to that, but I'm not
personally very familiar with using NIS for "host" information.....

     Nathan

Revision history for this message
Paul Crawford (psc-sat) wrote : Re: broken start-up dependencies for ntp

These are the nsswitch results:

------------------------------
$ ls -l /etc/nsswitch.conf
-rw-r--r-- 1 root root 600 May 1 11:28 /etc/nsswitch.conf

------------------------------
$ cat /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd: files nis compat
group: files nis compat
shadow: files nis compat

hosts: files nis dns
#hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
networks: files

protocols: db files
services: db files
ethers: db files
rpc: db files

netgroup: nis
automount: files nis

------------------------------
$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 terra.sat.dundee.ac.uk terra

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

I guess the 'hosts' entry was edited/replaced to add NIS, and also the netgroup and automount were added. I presume this is due to NIS not doing this sort of thing automatically when you install it (it asks for the domain, I think). I don't really understand NIS, and the guy usually responsible for this sort of thing is away, but as far as I know it only provides local-area user/machine authentication and so I would be surprised if it 'knows' about anything outside of our sub-domain (like google, or even the other university machines as they are not part of our NIS set-up).

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Thu, May 17, 2012 at 19:33:37 -0000, Paul Crawford wrote:
> $ cat /etc/nsswitch.conf
[...]
> hosts: files nis dns

> domain, I think). I don't really understand NIS, and the
> guy usually responsible for this sort of thing is away,
> but as far as I know it only provides local-area
> user/machine authentication and so I would be surprised
> if it 'knows' about anything outside of our sub-domain
> (like google, or even the other university machines as
> they are not part of our NIS set-up).

Yes, I would also have assumed that NIS wouldn't know
anything about google.com or other names, but given that
/etc/hosts is empty and the contents of the nsswitch.conf
hosts line, I can't think of any other place that host-name
information would be coming from...
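
The lookup order implied by the nsswitch.conf "hosts" line can be extracted with a few lines of code. This is a sketch with a hypothetical function name; it skips bracketed actions such as [NOTFOUND=return] and ignores commented-out lines.

```python
def hosts_sources(nsswitch_text):
    """Return the ordered lookup sources from the 'hosts:' line."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("hosts:"):
            continue
        sources, in_action = [], False
        for token in line[len("hosts:"):].split():
            if token.startswith("["):
                in_action = True      # start of an action such as [NOTFOUND=return]
            if not in_action:
                sources.append(token)
            if token.endswith("]"):
                in_action = False     # end of the bracketed action
        return sources
    return []


# The two hosts lines from the nsswitch.conf quoted in this report.
reported_nsswitch = """\
hosts: files nis dns
#hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
"""
print(hosts_sources(reported_nsswitch))  # ['files', 'nis', 'dns']
```

With this order, a name missing from /etc/hosts is tried against NIS first and against DNS only afterwards, which fits the behaviour seen here when no DNS servers are configured.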

Anyway, back to the question of getting ntpd working at
boot time:

Given that it seems your system does currently require NIS
to get host information, it makes sense that ntpd would
fail if it started before NIS was up.

While I don't have NIS installed anywhere, when I browse
the package source code it appears that there is not a
direct dependency between ntpd and NIS startup in the boot
scripts. (NIS is brought up via Upstart, while ntpd is
brought up via the /etc/rc*.d/S*ntp script.)

So I'm pretty sure that does explain why you have problems
with ntpd at first but it works if you restart it later
(since by that point the NIS servers are running.)

However, based on what you said about the /etc/resolv.conf
on your Lucid machine, it sounds like your site does have
normal DNS name resolution available. If that's true, then
I believe adding that information to your eth0 stanza in
/etc/network/interfaces would allow DNS-based name
resolution to work as soon as that interface is brought up
-- and since the /etc/rc*.d scripts aren't run until static
networking is up, that should mean that DNS would be
available by the time ntpd started.

(See the "ifup" sections of "man resolvconf" and
/usr/share/doc/resolvconf/README.gz for more info on adding
that info to the interfaces file.)

Since your nsswitch.conf "hosts" line does include "dns",
presumably ntpd will then be able to successfully look up
the ntp-server names, even if NIS isn't yet running at that
point in the booting process.

If that isn't a viable work-around, then hopefully someone
with more Upstart knowledge will be able to suggest the
proper way to resolve this NIS vs. ntpd start-up
dependency issue....

     Nathan

Revision history for this message
Paul Crawford (psc-sat) wrote : Re: broken start-up dependencies for ntp

Yes, it looks very much like the DNS system is broken here, but when I tried to look things up I get Bug #1001189 so overall not impressed with 12.04 so far :(

Still, adding some "dns-nameservers" lines to /etc/network/interfaces is the next obvious thing to try.

Revision history for this message
Paul Crawford (psc-sat) wrote :

I think this bug should concentrate on the key issue: that ntp (and maybe others?) is being brought up on the wrong event, that is it comes up with the interface, and not with the chosen type of name server.

In our case NIS provides user and name server resolution, and ntp comes up before it with 12.04

I don't know how LDAP is handled, but from the above comments it would appear to have the same problem, and so ntp is not currently able to resolve machines given only by NIS (or LDAP) name if they are not in the DNS (which I guess might be common with a large private network behind NAT).

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Thu, May 17, 2012 at 19:33:37 -0000, Paul Crawford wrote:
> domain, I think). I don't really understand NIS, and the
> guy usually responsible for this sort of thing is away,
> but as far as I know it only provides local-area
> user/machine authentication and so I would be surprised
> if it 'knows' about anything outside of our sub-domain
> (like google, or even the other university machines as
> they are not part of our NIS set-up).

For what it's worth, I see that at least some NIS servers
do support behind-the-scenes DNS lookups within the "hosts"
map; see for example the "-n" option to FreeBSD's ypserv
command:
  http://www.gsp.com/cgi-bin/man.cgi?section=8&topic=ypserv#10

So presumably some such server is in use at your site. (As
far as I can tell, the NIS servers for Linux don't support
that function, so I assume your NIS server there is not
running Ubuntu...)

However, the advice I see on the web generally agrees that
this function is obsolete (since the nsswitch.conf file now
lets clients configure the NIS vs. DNS issue directly), so
I wonder if your "NIS guy" actually intended for DNS
resolution to be left unconfigured on your Precise
system...?

      Nathan

Revision history for this message
Paul Crawford (psc-sat) wrote : Re: broken start-up dependencies for ntp

Yes, you are right in that our NIS servers are Solaris boxes, and they do support behind-the-scenes DNS lookups as it turns out. It is also true that NIS is deprecated, though a lot of older installations like ours still use it, and for most machines DNS is available and will probably fix our specific issue.

But...that is still not an excuse for daemons (like ntp) that might depend on the directory service (NIS, LDAP, ActiveDirectory, etc) being started without a sensibly configured sequence/dependency!

Revision history for this message
Nathan Stratton Treadway (nathanst) wrote : Re: [Bug 999725] Re: broken start-up dependencies for ntp

On Fri, May 18, 2012 at 17:47:21 -0000, Paul Crawford wrote:
> I think this bug should concentrate on the key issue:
> that ntp (and maybe others?) is being brought up on the
> wrong event, that is it comes up with the interface, and
> not with the chosen type of name server.

More specifically, the ntp package has not been converted
to Upstart yet, so it just comes up as part of the
rc-sysvinit scripts. That is, ntpd's startup itself isn't
tied to any specific event(s) at all (though as Steve's
comment hinted at, the execution of the rc-sysvinit scripts
as a group is triggered by the "filesystem and
static-network-up" condition).

I'm not sure off hand how the decision is made whether to
convert a package such as ntp to Upstart... but I see a
couple other bugs open on the topic: LP #604717 , LP #913379

> In our case NIS provides user and name server resolution,
> and ntp comes up before it with 12.04

(As far as I can tell, the NIS and ntp start conditions are
the same in Lucid and Precise, so I wonder if the reason
you don't see this problem on your Lucid machine is that
DNS is configured there.)

> I don't know how LDAP is handled, but from the above
> comments it would appear to have the same problem, and so
> ntp is not currently able to resolve machines given only
> by NIS (or LDAP) name if they are not in the DNS (which I
> guess might be common with a large private network behind
> NAT).

One thing to note is that ntp does spawn a separate process
that continues to retry looking up host names until it
finds an answer, so normally it will recover gracefully if
the lookup fails when ntp first starts up but start to work
later on.

I'm not sure of the details of how that interacts with
NIS-based host resolution, but I suspect this resolver
process doesn't deal with the NIS-is-not-ready-yet
situation the same way it does for DNS.

Anyway, I suspect that it's pretty rare for a site to have
no DNS at all, and that's probably why this issue hasn't
shown up for other people.

(Also, I don't know if there's an automated way for the
system to detect that ntp needs NIS to be up, so probably
such a dependency wouldn't be found in a default
installation. But if ntp were converted to Upstart, it
would be much easier for the system administrator to add
that dependency manually....)

      Nathan

summary: - broken start-up dependencies for ntp
+ broken start-up dependencies for ntp (starts before NIS is available)
Revision history for this message
Nathan Stratton Treadway (nathanst) wrote :

> I'm not sure off hand how the decision is made whether to
> convert a package such as ntp to Upstart... but I see a
> couple other bugs open on the topic: LP #604717 , LP #913379

Sorry, should have written those bug references as: LP: #604717 , LP: #913379

Revision history for this message
Paul Crawford (psc-sat) wrote :

It is probably true that this has not been seen much as a bug due to DNS normally being available, hence NIS dependency (if present) being a secondary issue. However, we found that ntp did not recover by itself, so possibly it only tries to find the nameservers once, but will re-try for the time servers in ntp.conf

I don't think it is necessary to test for any direct dependency on NIS in ntp.conf (that could be tricky) but a more robust approach would be to start (or restart?) ntp if a directory service is started/restarted just to make sure any such dependency or time-server changes are resolved. I presume this is the sort of thing Upstart can do.
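
Until ntp has a proper start-up dependency, the manual restart can be approximated by a helper that waits for the time-server names to become resolvable and only then restarts ntp. What follows is a sketch under stated assumptions, not an Upstart job and not anything shipped with Ubuntu: the function names, the retry count and the delay are all hypothetical, and the restart command is the one quoted in the bug description.

```python
import socket
import subprocess
import time


def _resolves(name, resolve):
    """Return True if the given resolver can look the name up right now."""
    try:
        resolve(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def wait_until_resolvable(names, resolve=socket.gethostbyname,
                          attempts=30, delay=2.0, sleep=time.sleep):
    """Retry until every name resolves; return False if attempts run out."""
    pending = list(names)
    for attempt in range(attempts):
        pending = [n for n in pending if not _resolves(n, resolve)]
        if not pending:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give NIS (or DNS) more time to come up
    return False


def restart_ntp():
    """Restart ntp with the init script named in this report (needs root)."""
    subprocess.check_call(["/etc/init.d/ntp", "restart"])
```

A late boot script could then call wait_until_resolvable(["ntp0.dundee.ac.uk"]) and run restart_ntp() only if it returns True. The resolver and sleep function are parameters so the loop can be exercised without a network.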

Incidentally it seems the last post for bug #604717 is spam, is there any way to get it deleted?

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for ntp (Ubuntu) because there has been no activity for 60 days.]

Changed in ntp (Ubuntu):
status: Incomplete → Expired
Revision history for this message
John Hupp (john.hupp) wrote :

I'm currently troubleshooting a problem with a Lubuntu Quantal setup with LTSP (terminal server network).

Most of the clients stop responding at a blank/black screen during bootup. But only after successfully PXE network booting, getting DHCP assignments, and beginning the process of booting from an NBD image on the server. I see the Lubuntu splash screen and then the blank/black screen.

I set up forwarding of the client syslog messages to the server, and the logs always end at a block of ntpd items, the last of which is "ntpd[1314]: Listening on routing socket on fd #24 for interface updates"

I'm using network-manager for the network configuration. This is the default config, except that I am using the experimental statement in /etc/dnsmasq.d/network-manager which replaces the "bind-interfaces" line with a "bind-dynamic" line. That solved a client boot error "PXE-E32: TFTP open timeout" on the only client that currently boots successfully.

This bug seems to have some behavior in common with Bug #959037 at https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/959037 which also addresses DNS-related bootup problems.

Comments? Troubleshooting? Workarounds?

Revision history for this message
Thomas Hood (jdthood) wrote :

@John: Does the affected machine have "nis" on the "hosts" line in /etc/nsswitch.conf? If not then your problem is not the same as this one.

Changed in ntp (Ubuntu):
status: Expired → New
Revision history for this message
John Hupp (john.hupp) wrote :

No, nis is not on hosts line of nsswitch.conf. I agree that my case is not the same as the one reported.

But since the problem here is reported to be one in which ntp is being started before the system is capable of resolving DNS, I couldn't help but notice the similarity to my previous problem in which dnsmasq was starting before the needed network interface was configured, and in the default configuration without the experimental dnsmasq configuration, it did not remedy that once the network interface finally was configured.

So I wondered if something similar was in play with my new problem.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ntp (Ubuntu):
status: New → Confirmed
Revision history for this message
Christian Ehrhardt (paelzer) wrote :

I know it's been a long time, but I'm cleaning up old NTP bugs at the moment.

We haven't had any similar reports since then, and the popularity of NIS has declined.
Also, even basic time is now handled by systemd without further configuration, and systemd has totally different dependencies for all of this.
Finally, this was only accidentally reopened, but the discussion clarified that the second report was not the same bug.
Setting to Invalid since I can't manually set Expired.

Changed in ntp (Ubuntu):
status: Confirmed → Invalid