“Canonical Kernel Team” team

Code review comment for ~canonical-kernel-team/+git/autotest-client-tests:ubuntu_stress_smoke_split_test

Git
lp:~canonical-kernel-team/+git/autotest-client-tests
ubuntu_stress_smoke_split_test
Merge into master

Revision history for this message

Po-Hsu Lin (cypressyew) wrote on 2023-12-07:

Cool! Thanks for fixing this.
The test case number is correct now (setup + 267 stressors + swapoff)

I just found out there is a "sleep 15" command in the shell script, right after setting the oom level. That's the real cause of this time-consuming issue. I think that one should be moved out as well. (It's still a good thing to not swap on/off repeatedly.)

Also, there is one issue with doing swap configuration in setup():

The logic for autotest is that the setup() will get triggered only once if the test case version didn't get bumped. So when you try to run the same test suite on the same machine without re-provisioning it or removing autotest/client/tmp/<test suite> directory, the swap won't be enabled on your second attempt. This happens when we need to debug some issues manually.

This is the workflow:
1. initialize()
2. setup()
3. run_once() with stressors
4. swapoff()

With a second run immediately after the first one:
5. initialize()
6. run_once() with stressors
7. swapoff()

I can think of two possible solutions now.
1. Like when we did for ubuntu_lttng_smoke_test [1], move these pre-test configuration codes from setup() to initialize(), in which it is guaranteed to be executed on the very beginning of every run. (The only problem is that this will begin before setup(), it will fail if any of these action depends on a package that needs to be installed in setup() first)
2. Create a new job like the swapoff() you made here, and run it before iterating through all of the stressors

Solution 2 might be more straightforward, we can break the smoke_test shell script into 3 parts:
1. Pre-test configuration like check_machine(), sleep 15 and set_max_oom_level into one script. And determine if we need to skip the stressors base on the check_machine() result there. (For example a VM with small ram. Now it will still iterate through all the tests but without doing any real things. These skipped tests will be treated as PASSED and it cloud give us a false impression that this has been tested and the result is good.)
2. The stressor execution part, which is done.
3. cleanup, swapoff, zswap and cgroup_mem restoration part.

Part 1 and 3 can be embedded in ubuntu_stress_smoke_test.py like what you have already done for swapoff() here. Or just stay in a script and call them within ubuntu_stress_smoke_test.py. The latter might give us more flexibility whenever we need to move away from autotest. But I am good with both choices since we're not planning to do this now.

Sorry for getting this comment too lengthy, please feel free to ping me is you have any question.

[1] https://git.launchpad.net/~canonical-kernel-team/+git/autotest-client-tests/commit/?id=ca3878e0ecac7a284ad6ee0175db72e1d9cefa9d

Cool! Thanks for fixing this.
The test case number is correct now (setup + 267 stressors + swapoff)

Also, there is one issue with doing swap configuration in setup():

This is the workflow:
1. initialize()
2. setup()
3. run_once() with stressors
4. swapoff()
    
With a second run immediately after the first one:
5. initialize()
6. run_once() with stressors
7. swapoff()

Sorry for getting this comment too lengthy, please feel free to ping me is you have any question.

[1] https://git.launchpad.net/~canonical-kernel-team/+git/autotest-client-tests/commit/?id=ca3878e0ecac7a284ad6ee0175db72e1d9cefa9d

review: Needs Fixing

« Back to merge proposal