Comment 55 for bug 114505

cwsupport (netsupport) wrote :

I have tracked this down: the ifup script is causing the issue.

S23ntp starts a process with pid 7083.

After startup, ps -ef shows ntpd with a pid of 3628, not 7083.
The neighbouring pids are 2939 (udev) immediately before it and 5502 (portmap) after it.

This shows just how early the running ntpd process was started: well before S23ntp ran.

The machines I run have

nsswitch.conf: hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4
resolv.conf: <EMPTY!>

All these machines are hard-wired with static IPs and run, at a minimum, a caching DNS server. Therefore, to resolve ANY external hosts, NTPD *must* be started after bind9.

However, the if-up.d ntpdate script completely undermines the rc2.d start ordering.

Removing the /etc/network/if-up.d/ntpdate script completely removed the problem and everything operates as normal.

if-up scripts are also fundamentally flawed: bringing up the interface of a hard-wired network card results in the interface coming UP *even* if the cable is unplugged. Plugging the cable in later makes the network work, but it does not generate an event that restarts NTPD. This matters because of the flaw in NTPD: if it cannot resolve or cannot connect to a host, it ignores the specified server.

As an interim fix these scripts should check to see if NTPD is running - if NOT then ntpd SHOULD NOT be started. This then ensures correct functionality on both hard-wired and wireless systems.
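
The check described above could look roughly like the following. This is only a sketch, not the shipped if-up.d/ntpdate script: the function names are made up for illustration, and the pid file location /var/run/ntpd.pid is an assumption about where the ntp init script records the daemon's pid.

```shell
#!/bin/sh
# Hypothetical guard for /etc/network/if-up.d/ntpdate (illustrative only).
# Assumption: the ntp init script writes the daemon pid to this file.
PIDFILE="${PIDFILE:-/var/run/ntpd.pid}"

# Succeeds only if the pid file is readable and names a live process.
# kill -0 sends no signal; it only tests whether the pid can be signalled,
# which is sufficient here because if-up.d hooks run as root.
ntpd_running() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile" 2>/dev/null)
    [ -n "$pid" ] || return 1
    kill -0 "$pid" 2>/dev/null
}

# Decide what the hook should do when an interface comes up.
ifup_action() {
    if ntpd_running "$1"; then
        echo "resync"   # ntpd was running: stop it, step the clock, start it again
    else
        echo "skip"     # ntpd was not running: leave the clock and the daemon alone
    fi
}

ifup_action "$PIDFILE"
```

With this guard, the early udev-triggered run falls into the "skip" branch because S23ntp has not started the daemon yet, while a later interface event on a machine with ntpd already running takes the "resync" branch.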
------------------
Case 1: hardwired

udev brings up eth0
if-up.d/ntpdate runs - no ntpd server is currently running so it does nothing.
S15bind9 comes up
S23ntp comes up - the ntpdate-debian call I've put in this script corrects the clock and NTPD runs.

Case 2: wireless

udev brings up eth0
if-up.d/ntpdate runs - no ntpd server is currently running so it does nothing.
S15bind9 comes up
S23ntp comes up - ntpdate-debian fails but the server runs.

NetworkManager subsequently brings up ath0 and connects.
(I believe) if-up.d/ntpdate now reruns - it pulls down NTPD, runs ntpdate (no longer required if you have ntpdate-debian in the init.d script, which is more sensible) and then brings up NTPD again because *it was running* prior to the interface coming up.
-----------------

I believe the following fixes are required - 2 solutions offered:

1. Best Solution
a) calls to ntpdate-debian should set the hwclock as discussed above.
b) ntpd is corrected to behave appropriately (panic threshold and server connect retries)

2. Interim Solution
a) calls to ntpdate-debian should set the hwclock as discussed above.
b) only pull up the ntpd server (or even call ntpdate) in if-up scripts if it was previously running. After all, ntpdate should only be used automatically if ntp as a service has been started. If the service is never started, the admin is stating that they do not want the system to automatically modify the date. Merely having the software installed should not cause it to modify the clock automatically - it needs to be installed and the service started.
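
Point a) in both solutions could be sketched as below. This is an illustration only: sync_clock is a made-up helper name, and the NTPDATE and HWCLOCK variables exist purely so the commands can be swapped out. hwclock --systohc copies the system time to the hardware clock.

```shell
#!/bin/sh
# Hypothetical helper for the start action of /etc/init.d/ntp (illustrative only).
# The two commands are overridable, which is an addition for this sketch.
NTPDATE="${NTPDATE:-ntpdate-debian}"
HWCLOCK="${HWCLOCK:-hwclock}"

# Step the system clock once; only if that succeeds, write it to the hardware clock.
sync_clock() {
    if $NTPDATE >/dev/null 2>&1; then
        $HWCLOCK --systohc
        echo "stepped"
    else
        # No server reachable yet (e.g. wireless not connected): leave hwclock untouched.
        echo "unsynced"
    fi
}

# Intended use, just before the daemon is started:
#   sync_clock
```

Writing the hardware clock only after a successful step avoids saving a wrong time when no server could be reached, as in the wireless case above.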