Merge lp:~qzhang/lava-dispatcher/fix-network-timeout into lp:lava-dispatcher

Proposed by Spring Zhang
Status: Rejected
Rejected by: Spring Zhang
Proposed branch: lp:~qzhang/lava-dispatcher/fix-network-timeout
Merge into: lp:lava-dispatcher
Diff against target: 14 lines (+2/-1)
1 file modified
lava_dispatcher/client.py (+2/-1)
To merge this branch: bzr merge lp:~qzhang/lava-dispatcher/fix-network-timeout
Reviewer Review Type Date Requested Status
Spring Zhang (community) Disapprove
Paul Larson (community) Needs Fixing
Review via email: mp+68774@code.launchpad.net

Description of the change

the way timeout working currently seems not work for some boards, make it a sleep to get timeout, and decrease the detection time to half of timeout.

To post a comment you must log in.
Revision history for this message
Paul Larson (pwlars) wrote :

Can you explain more about what you are trying to accomplish here? This patch would seem to make it (assuming default timeout of 120 seconds):
1. Sleep for 120 seconds
2. check to see if the network is up
3. repeat #2 until it's up, or until 60 seconds have gone by.

Basically this is just going to cause it to wait 2 minutes before even checking to see if the network is alive, then scan it for half that time. I don't think that's what we want to do here.

review: Needs Fixing
Revision history for this message
Spring Zhang (qzhang) wrote :

> Can you explain more about what you are trying to accomplish here? This patch
> would seem to make it (assuming default timeout of 120 seconds):
> 1. Sleep for 120 seconds
> 2. check to see if the network is up
> 3. repeat #2 until it's up, or until 60 seconds have gone by.
>
> Basically this is just going to cause it to wait 2 minutes before even
> checking to see if the network is alive, then scan it for half that time. I
> don't think that's what we want to do here.
Please see the bug description, I find the timeout way used before can not get the timeout(default 120s) on my board, if the network is not probed up, it will ping several times and fail, but the time is far away reached the timeout. I made a patch to increase the timeout to be 120s, but now I find it can't work, so I think it's better to wait for the timeout(just sleep), and also use a former detection way with a smaller timeout(now I set to half)

Revision history for this message
Paul Larson (pwlars) wrote :

But the problem with this is that for some (most?) people, the board will acquire the address within seconds of booting. Waiting 2 minutes before even trying is going to be annoying, and useless in that case. Looking at the bug, I see 2 things that it seems are happening:
1. you are only getting 4 attempts in the 2 minutes it takes before timing out. This is unusual. How long is each attempt taking? It should probably fail out after a reasonable amount of time. For me, it takes about 3 seconds I think before ping exits if it doesn't get a response, and sometimes immediately depending on the situation. If this isn't happening for you, we should see why, or better yet just add -w5 or something so that it times out after 5 seconds regardless of what it has or hasn't received.
2. It's still not waiting long enough for you. So as I asked in the last merge proposal, how long does it reasonably take on your network, before you can be sure you'll get a response on the address??? I think 2 minutes is a fairly long amount of time, and we've pushed this timeout several times already. I'm tempted to say let's just make it 5 minutes, but I'm worried that still won't be long enough for your situation.

Let's find some reasonable timeout and just be done with this, but don't wait 2 minutes before even checking to see if the network is up.

Revision history for this message
Loïc Minier (lool) wrote :

So maybe this mp should be rejected?

Revision history for this message
Spring Zhang (qzhang) wrote :

I agree. It's more related to board issue.

review: Disapprove

Unmerged revisions

81. By Spring Zhang

make waiting for network up time a real timeout

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lava_dispatcher/client.py'
2--- lava_dispatcher/client.py 2011-07-21 16:47:16 +0000
3+++ lava_dispatcher/client.py 2011-07-22 03:35:46 +0000
4@@ -120,8 +120,9 @@
5 return False
6
7 def wait_network_up(self, timeout=120):
8+ time.sleep(timeout)
9 now = time.time()
10- while time.time() < now+timeout:
11+ while time.time() < now+timeout/2:
12 if self.check_network_up():
13 return
14 raise NetworkError

Subscribers

People subscribed via source and target branches