Merge lp:~liuyq0307/lava-lab/keep-one-fastmodel-per-node into lp:lava-lab

Proposed by Yongqin Liu
Status: Needs review
Proposed branch: lp:~liuyq0307/lava-lab/keep-one-fastmodel-per-node
Merge into: lp:lava-lab
Diff against target: 81 lines (+15/-29)
7 files modified
lava/devices/fastmodels01/production/rtsm_ve-armv8_01.conf (+7/-0)
lava/devices/fastmodels01/production/rtsm_ve-armv8_03.conf (+0/-7)
lava/devices/fastmodels01/production/rtsm_ve-armv8_04.conf (+0/-7)
lava/devices/fastmodels02/production/rtsm_ve-armv8_01.conf (+0/-7)
lava/devices/fastmodels02/production/rtsm_ve-armv8_02.conf (+1/-1)
lava/devices/fastmodels03/production/rtsm_ve-armv8_03.conf (+7/-0)
lava/devices/fastmodels03/production/rtsm_ve-armv8_05.conf (+0/-7)
To merge this branch: bzr merge lp:~liuyq0307/lava-lab/keep-one-fastmodel-per-node
Reviewer        Review Type    Date Requested    Status
Dave Pigott                                      Needs Fixing
Amit Pundir                                      Pending
Matthew Hart                                     Pending
Review via email: mp+185963@code.launchpad.net

Description of the change

For fast models, we want to keep only one instance on each node. However, it seems that after a lava-instance redeployment (or in some other cases), offline devices come back online and get assigned jobs.

This change therefore removes the unnecessary devices and keeps only one fast model device on each node.

Antonio Terceiro (terceiro) wrote :

Did you discuss this in advance with the lab guys (Dave & Matt)?

Yongqin Liu (liuyq0307) wrote :

> Did you discuss this in advance with the lab guys (Dave & Matt)?
Not yet; I did not manage to catch up with Matt yesterday.

I think we can discuss it when he reviews this change.

Matthew Hart (matthew-hart) wrote :

This would take us from 5 armv8 fastmodels down to 3. That's fine but we need to remember there are currently 4 jobs running so it may start to cause a backlog.

Also, before this gets rolled out I need to configure the bridge on fastmodels01 (it's done, just needs rebooting when no jobs are running).

Yongqin Liu (liuyq0307) wrote :

> This would take us from 5 armv8 fastmodels down to 3.
This is what we expect.
> That's fine but we need
> to remember there are currently 4 jobs running so it may start to cause a
> backlog.
I think we can cancel these 4 jobs and resubmit them after the change is deployed.

To: Amit Pundir
What do you think about this? Is it OK to cancel the 4 jobs so that this change can be deployed?

Thanks,

Dave Pigott (dpigott) wrote :

YongQin - that's not the point. Those jobs take a long time. If we only have 3 fastmodels and 4 jobs, we will start building a backlog of jobs, and we won't be at the tip of testing.
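
The backlog concern can be made concrete with a rough sketch. It assumes that all four jobs are submitted together and that each takes the same time; the 20-hour figure is only an illustrative assumption borrowed from the longest CTS run mentioned in this thread, and the function names are hypothetical.

```python
import math


def rounds_needed(jobs: int, devices: int) -> int:
    """Number of sequential batches needed to run `jobs` on `devices`."""
    return math.ceil(jobs / devices)


def drain_time_hours(jobs: int, devices: int, hours_per_job: float) -> float:
    """Wall-clock time to finish every job when each device runs one job at a time."""
    return rounds_needed(jobs, devices) * hours_per_job


JOB_HOURS = 20  # assumption: every job is as long as the longest CTS run

with_five = drain_time_hours(jobs=4, devices=5, hours_per_job=JOB_HOURS)
with_three = drain_time_hours(jobs=4, devices=3, hours_per_job=JOB_HOURS)
print(f"5 devices: {with_five}h, 3 devices: {with_three}h")
```

Under these assumptions the fourth job has to wait for a free device, so the queue takes two rounds instead of one. Real jobs differ in length, so this is only an upper-level illustration of why a backlog can form.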

Amit Pundir (pundiramit) wrote :

Dave, Matt,

I understand your concern, but over time we have found that running multiple fastmodels on a single machine puts extra load on that machine. Random adb connection failures are also observed rather frequently on machines running multiple models.

It is "OK" to run the regular kernel and system tests, which take about 2-3 hours in total, on multiple models sharing a single machine. CTS tests (one of which runs for 20 hours straight), however, are very sensitive to this extra load: they tend to reboot models when they time out due to inactivity, or simply skip the remaining tests. We have observed this on local setups as well.

We just want to make sure that we run the tests (CTS and Monkey in particular) to completion. Ideally it would be great to have one more machine in the lab dedicated to fastmodels so that we don't build up a backlog of jobs.

Thanks.

Dave Pigott (dpigott) wrote :

So, previous to this change we had:

fastmodels01
  rtsm_foundation-armv8_01
  rtsm_ve-a15x1-a7x1_01
  rtsm_ve-a15x4-a7x4_01
  rtsm_ve-a15x4-a7x4_02
  rtsm_ve-armv8_03
  rtsm_ve-armv8_04
fastmodels02
  rtsm_ve-armv8_01
  rtsm_ve-armv8_02
fastmodels03
  rtsm_foundation-armv8_02
  rtsm_ve-armv8_05

And after this change we would have:

fastmodels01
  rtsm_foundation-armv8_01
  rtsm_ve-a15x1-a7x1_01
  rtsm_ve-a15x4-a7x4_01
  rtsm_ve-a15x4-a7x4_02
  rtsm_ve-armv8_01

fastmodels02
  rtsm_ve-armv8_02

fastmodels03
  rtsm_foundation-armv8_02
  rtsm_ve-armv8_03

So in fact, your change does not remove multiple fast models except on one server, fastmodels02.
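
The two listings above can be checked mechanically. The sketch below copies them into dictionaries and counts devices per node; the helper names and the rule that the device type is everything before the final `_NN` suffix are assumptions made for illustration.

```python
BEFORE = {
    "fastmodels01": [
        "rtsm_foundation-armv8_01", "rtsm_ve-a15x1-a7x1_01",
        "rtsm_ve-a15x4-a7x4_01", "rtsm_ve-a15x4-a7x4_02",
        "rtsm_ve-armv8_03", "rtsm_ve-armv8_04",
    ],
    "fastmodels02": ["rtsm_ve-armv8_01", "rtsm_ve-armv8_02"],
    "fastmodels03": ["rtsm_foundation-armv8_02", "rtsm_ve-armv8_05"],
}

AFTER = {
    "fastmodels01": [
        "rtsm_foundation-armv8_01", "rtsm_ve-a15x1-a7x1_01",
        "rtsm_ve-a15x4-a7x4_01", "rtsm_ve-a15x4-a7x4_02",
        "rtsm_ve-armv8_01",
    ],
    "fastmodels02": ["rtsm_ve-armv8_02"],
    "fastmodels03": ["rtsm_foundation-armv8_02", "rtsm_ve-armv8_03"],
}


def device_type(name: str) -> str:
    """Assumed naming rule: the type is the name without its trailing _NN index."""
    return name.rsplit("_", 1)[0]


def summarize(inventory: dict) -> dict:
    """Per node: total fast models and how many are of type rtsm_ve-armv8."""
    return {
        node: {
            "total": len(devices),
            "armv8": sum(device_type(d) == "rtsm_ve-armv8" for d in devices),
        }
        for node, devices in inventory.items()
    }


def single_model_nodes(inventory: dict) -> list:
    """Nodes that host exactly one fast model of any type."""
    return sorted(node for node, devices in inventory.items() if len(devices) == 1)


print(summarize(BEFORE))
print(summarize(AFTER))
print(single_model_nodes(AFTER))
```

With these listings, every node ends up with one rtsm_ve-armv8 device, but only fastmodels02 ends up with a single fast model overall.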

Perhaps the solution is to have a new device type, rtsm_ve-armv8-single, which is guaranteed to run on a single node, and only provide one of those.

My main issue here is that if we have to have a server per fast model instance, then to support the current load and spread, and future new fast model types, I need 10 servers, at $2000 each, one per fast model instance. That seems rather excessive.

You will need to get authorisation from Mark Orvek for that sort of spend.

review: Needs Fixing
Yongqin Liu (liuyq0307) wrote :

On 23 September 2013 20:59, Dave Pigott <email address hidden> wrote:

> Review: Needs Fixing
>
> So, previous to this change we had:
>
> fastmodels01
> rtsm_foundation-armv8_01
> rtsm_ve-a15x1-a7x1_01
> rtsm_ve-a15x4-a7x4_01
> rtsm_ve-a15x4-a7x4_02
> rtsm_ve-armv8_03
> rtsm_ve-armv8_04
> fastmodels02
> rtsm_ve-armv8_01
> rtsm_ve-armv8_02
> fastmodels03
> rtsm_foundation-armv8_02
> rtsm_ve-armv8_05
>
> And after this change we would have:
>
> fastmodels01
> rtsm_foundation-armv8_01
> rtsm_ve-a15x1-a7x1_01
> rtsm_ve-a15x4-a7x4_01
> rtsm_ve-a15x4-a7x4_02
> rtsm_ve-armv8_01
>
> fastmodels02
> rtsm_ve-armv8_02
>
> fastmodels03
> rtsm_foundation-armv8_02
> rtsm_ve-armv8_03
>
> So in fact, your change does not remove multiple fast models except on one
> server, fastmodels02.
>
Yes, the aim of this change is to have only one rtsm_ve-armv8_0x device on
each node, and to align each device name with the name of the physical
machine node it runs on.
As far as I know, no Android images are tested on rtsm_foundation-xxx
devices, so that should be OK IMO.
For the rtsm_ve-a15xxxx devices, we need to see how much effect they have;
for now it seems OK.
In the past, when only rtsm_ve-armv8_03 or rtsm_ve-armv8_04 was online,
the test jobs worked well
(maybe the load caused by the rtsm_ve-a15xxxx devices is not too large).

>
> Perhaps the solution is to have a new device type, rtsm_ve-armv8-single
> which is guaranteed to run on a single node, and only provide one of those.
>
> My main issue here is that if we have to have a server per fast model
> instance, then to support the current load and spread, and future new fast
> model types, I need 10 servers, at $2000 each, one per fast model instance.
> That seems rather excessive.
>

As for whether we should have one server per fast model instance, I
think that is still under discussion.

> You will need to get authorisation from Mark Orvek for that sort of spend.
>
> --
>
> https://code.launchpad.net/~liuyq0307/lava-lab/keep-one-fastmodel-per-node/+merge/185963
> You are the owner of lp:~liuyq0307/lava-lab/keep-one-fastmodel-per-node.
>

--
Thanks,
Yongqin Liu

Unmerged revisions

316. By Yongqin Liu

clean the fast model device to keep only one instance on each node

315. By Yongqin Liu

clean the fast model device to keep only one instance on each node

Preview Diff

=== added file 'lava/devices/fastmodels01/production/rtsm_ve-armv8_01.conf'
--- lava/devices/fastmodels01/production/rtsm_ve-armv8_01.conf 1970-01-01 00:00:00 +0000
+++ lava/devices/fastmodels01/production/rtsm_ve-armv8_01.conf 2013-09-17 02:07:10 +0000
@@ -0,0 +1,7 @@
+# DO NOT EDIT - MANAGED BY SALT!
+
+device_type = rtsm_ve-armv8
+
+license_file = "8224@control"
+
+interfaceName = armv8_01

=== removed file 'lava/devices/fastmodels01/production/rtsm_ve-armv8_03.conf'
--- lava/devices/fastmodels01/production/rtsm_ve-armv8_03.conf 2013-06-25 06:11:06 +0000
+++ lava/devices/fastmodels01/production/rtsm_ve-armv8_03.conf 1970-01-01 00:00:00 +0000
@@ -1,7 +0,0 @@
-# DO NOT EDIT - MANAGED BY SALT!
-
-device_type = rtsm_ve-armv8
-
-license_file = "8224@control"
-
-interfaceName = armv8_01

=== removed file 'lava/devices/fastmodels01/production/rtsm_ve-armv8_04.conf'
--- lava/devices/fastmodels01/production/rtsm_ve-armv8_04.conf 2013-06-25 06:11:06 +0000
+++ lava/devices/fastmodels01/production/rtsm_ve-armv8_04.conf 1970-01-01 00:00:00 +0000
@@ -1,7 +0,0 @@
-# DO NOT EDIT - MANAGED BY SALT!
-
-device_type = rtsm_ve-armv8
-
-license_file = "8224@control"
-
-interfaceName = armv8_02

=== removed file 'lava/devices/fastmodels02/production/rtsm_ve-armv8_01.conf'
--- lava/devices/fastmodels02/production/rtsm_ve-armv8_01.conf 2013-06-25 06:11:06 +0000
+++ lava/devices/fastmodels02/production/rtsm_ve-armv8_01.conf 1970-01-01 00:00:00 +0000
@@ -1,7 +0,0 @@
-# DO NOT EDIT - MANAGED BY SALT!
-
-device_type = rtsm_ve-armv8
-
-license_file = "8224@control"
-
-interfaceName = armv8_01

=== modified file 'lava/devices/fastmodels02/production/rtsm_ve-armv8_02.conf'
--- lava/devices/fastmodels02/production/rtsm_ve-armv8_02.conf 2013-06-25 06:11:06 +0000
+++ lava/devices/fastmodels02/production/rtsm_ve-armv8_02.conf 2013-09-17 02:07:10 +0000
@@ -4,4 +4,4 @@

 license_file = "8224@control"

-interfaceName = armv8_02
+interfaceName = armv8_01

=== added file 'lava/devices/fastmodels03/production/rtsm_ve-armv8_03.conf'
--- lava/devices/fastmodels03/production/rtsm_ve-armv8_03.conf 1970-01-01 00:00:00 +0000
+++ lava/devices/fastmodels03/production/rtsm_ve-armv8_03.conf 2013-09-17 02:07:10 +0000
@@ -0,0 +1,7 @@
+# DO NOT EDIT - MANAGED BY SALT!
+
+device_type = rtsm_ve-armv8
+
+license_file = "8224@control"
+
+interfaceName = armv8_01

=== removed file 'lava/devices/fastmodels03/production/rtsm_ve-armv8_05.conf'
--- lava/devices/fastmodels03/production/rtsm_ve-armv8_05.conf 2013-06-25 06:11:06 +0000
+++ lava/devices/fastmodels03/production/rtsm_ve-armv8_05.conf 1970-01-01 00:00:00 +0000
@@ -1,7 +0,0 @@
-# DO NOT EDIT - MANAGED BY SALT!
-
-device_type = rtsm_ve-armv8
-
-license_file = "8224@control"
-
-interfaceName = armv8_01
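
The one-device-per-node rule that this diff aims for could be checked automatically. The following is a minimal sketch, not LAVA's own parser: it assumes the device files are plain `key = value` lines with `#` comments, as in the diff above, and the function names and the `(node, filename)` keying are hypothetical.

```python
SAMPLE_CONF = """\
# DO NOT EDIT - MANAGED BY SALT!

device_type = rtsm_ve-armv8

license_file = "8224@control"

interfaceName = armv8_01
"""


def parse_device_conf(text: str) -> dict:
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def nodes_over_limit(confs: dict, device_type: str = "rtsm_ve-armv8") -> list:
    """Return nodes hosting more than one device of `device_type`.

    `confs` maps (node, filename) to the text of that device file.
    """
    counts = {}
    for (node, _filename), text in confs.items():
        if parse_device_conf(text).get("device_type") == device_type:
            counts[node] = counts.get(node, 0) + 1
    return sorted(node for node, count in counts.items() if count > 1)


# Layout before the change: two rtsm_ve-armv8 devices on fastmodels01 and fastmodels02.
before = {
    ("fastmodels01", "rtsm_ve-armv8_03.conf"): SAMPLE_CONF,
    ("fastmodels01", "rtsm_ve-armv8_04.conf"): SAMPLE_CONF,
    ("fastmodels02", "rtsm_ve-armv8_01.conf"): SAMPLE_CONF,
    ("fastmodels02", "rtsm_ve-armv8_02.conf"): SAMPLE_CONF,
    ("fastmodels03", "rtsm_ve-armv8_05.conf"): SAMPLE_CONF,
}
# Layout after the change: one rtsm_ve-armv8 device per node.
after = {
    ("fastmodels01", "rtsm_ve-armv8_01.conf"): SAMPLE_CONF,
    ("fastmodels02", "rtsm_ve-armv8_02.conf"): SAMPLE_CONF,
    ("fastmodels03", "rtsm_ve-armv8_03.conf"): SAMPLE_CONF,
}
print(nodes_over_limit(before))
print(nodes_over_limit(after))
```

Since the files are marked as managed by Salt, a check like this would more naturally run against the branch contents before deployment than against the deployed files.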
