Merge lp:~asac/ubuntu-test-cases/use-top-and-always-dumb-toplog into lp:ubuntu-test-cases/touch

Proposed by Alexander Sack
Status: Merged
Merged at revision: 12
Proposed branch: lp:~asac/ubuntu-test-cases/use-top-and-always-dumb-toplog
Merge into: lp:ubuntu-test-cases/touch
Diff against target: 140 lines (+32/-33)
1 file modified
systemsettle/systemsettle.sh (+32/-33)
To merge this branch: bzr merge lp:~asac/ubuntu-test-cases/use-top-and-always-dumb-toplog
Reviewer: Paul Larson
Review status: Approve
Review via email: mp+180889@code.launchpad.net

This proposal supersedes a proposal from 2013-08-19.

Description of the change

+ use top in batch mode instead of vmstat
+ improve logging: echo the CLI arguments given and dump top_log regardless of success or failure
+ top_log is now more useful for seeing what was going on, since we dump exactly what was measured
+ addresses all previous comments (except the request for optionally dumping the top log)
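
As a rough illustration of the batch-mode approach described above, the following sketch pulls the idle percentage out of each CPU summary line that top prints. The helper name extract_idle and the sample lines are assumptions made for illustration (the sample follows the procps-style summary format); the sed expression mirrors the one used in the diff.

```shell
#!/bin/sh
# Sketch only: extract_idle is a hypothetical helper, not part of systemsettle.sh.
# It reads top batch output on stdin and prints one idle value per CPU summary line.
extract_idle() {
    # Keep only the CPU summary lines, then capture the number before " id".
    grep 'Cpu' | sed -e 's/.* \([0-9.]*\) id.*/\1/'
}

# Canned input stands in for a live run such as: top -b -d 6 -n 10
# The line format below is an assumption about what top prints.
printf '%s\n' \
    '%Cpu(s):  1.0 us,  0.5 sy,  0.0 ni, 98.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st' \
    '%Cpu(s): 10.0 us,  5.0 sy,  0.0 ni, 85.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st' \
    | extract_idle
```

With the canned input this prints 98.5 and 85.0. Other top versions or locales may format the summary line differently, in which case the pattern would need adjusting.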

Paul Larson (pwlars) wrote : Posted in a previous version of this proposal

I haven't looked too deeply just yet, but there are some things I see after a quick trial run locally:

>Measurement:
> + idle level: 148.50
> + idle sum: 148.5 / count: 2
>
>system settled. SUCCESS
I told it to only pass if it hit 99% or better, and my system was constantly at 80% or less idle... clearly this is not good.
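
For reference, the expected arithmetic for the numbers quoted above can be checked with a short sketch. It uses awk rather than the script's own calc helper, and the function names are hypothetical; it only shows what the average and the pass/fail decision should have been, not why the script reported 148.50.

```shell
#!/bin/sh
# Sketch only: average_idle and is_settled are hypothetical helpers.

# Print sum / count with two decimals.
average_idle() {
    awk -v sum="$1" -v count="$2" 'BEGIN { printf "%.2f\n", sum / count }'
}

# Succeed (exit 0) only if the average reaches the minimum idle level.
is_settled() {
    awk -v avg="$1" -v min="$2" 'BEGIN { exit !((avg + 0) >= (min + 0)) }'
}

avg=$(average_idle 148.5 2)
echo "idle level: $avg"
if is_settled "$avg" 99; then
    echo "settled"
else
    echo "not settled"
fi
```

An idle sum of 148.5 over 2 samples averages to 74.25, which is below the requested 99, so the run should have been reported as not settled.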

Using top in the way you are using it right now means that we only see that it's starting the first pass, but then we don't get any more feedback that something is happening until the end of the entire run.

27 +echo " cmd = 'top -b -d $vmstat_wait -n $vmstat_repeat' ignoring first $vmstat_ignore (tail: $vmstat_tail)"
Instead of printing out 7 lines of verbose detail about the options we just passed it, I think it would make more sense to just have the above line do something like print the exact command line args. I had thought about changing this in the first pass, but waited.

48 + top -b -d $vmstat_wait -n $vmstat_repeat >> $top_log
49 + cat $top_log | grep '.Cpu.*' | tail -n $vmstat_tail > $vmstat_log.reduced
If we are no longer using vmstat, we may as well change the variable names to reflect that as well.

Printing the entire top log of every run is not always desirable, and sometimes gets in the way. Considering our primary use, I think it's sensible to have it on by default, but we should at least have an option to turn it off.
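
One way such an opt-out could look is sketched below. The -q flag and the dump_top_log variable are hypothetical and are not part of systemsettle.sh or of this merge proposal; the sketch only shows the "on by default, switchable off" behaviour suggested above.

```shell
#!/bin/sh
# Sketch only: -q and dump_top_log are hypothetical, not options of systemsettle.sh.
parse_dump_option() {
    dump_top_log=1      # dump the top log by default
    OPTIND=1            # reset so the function can be called more than once
    while getopts "q" opt "$@"; do
        case $opt in
            q) dump_top_log=0 ;;    # -q: skip the dump
            *) return 2 ;;
        esac
    done
}

parse_dump_option -q
echo "dump_top_log=$dump_top_log"
```

A cleanup function could then test "$dump_top_log" = 1 before printing the log.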

review: Needs Fixing
Alexander Sack (asac) wrote : Posted in a previous version of this proposal

> I haven't looked too deeply just yet, but there are some things I see after a
> quick trial run locally:
>
> >Measurement:
> > + idle level: 148.50
> > + idle sum: 148.5 / count: 2
> >
> >system settled. SUCCESS
> I told it to only pass if it hit 99% or better, and my system was constantly
> at 80% or less idle... clearly this is not good.

I will fix the problem you saw with your 148.5% idle average, but I need a way to reproduce it. The script works here on my system and on the phone.

>
> Using top in the way you are using it right now means that we only see that
> it's starting the first pass, but then we don't get any more feedback that
> something is happening until the end of the entire run.

We see the result, not the pumping. Instead you get everything nicely at the end. I think it's good that way. I will tweak the output to make it clearer that it's just "measuring now..."

>
> 27 +echo " cmd = 'top -b -d $vmstat_wait -n $vmstat_repeat' ignoring
> first $vmstat_ignore (tail: $vmstat_tail)"
> Instead of printing out 7 lines of verbose detail about the options we just
> passed it, I think it would make more sense to just have the above line do
> something like print the exact command line args. I had thought about changing
> this in the first pass, but waited.

Yeah, let me do something about this. I will probably remove that line because we have all the details now, and then use set -x / set +x to turn echoing of the command line on and off (so we really see what happens).
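
A minimal sketch of the set -x / set +x idea follows. The wrapper name measure is hypothetical, and the merged diff below does not necessarily use this approach; it only illustrates tracing a single command.

```shell
#!/bin/sh
# Sketch only: measure is a hypothetical wrapper, not part of systemsettle.sh.
measure() {
    set -x                                  # echo each command to stderr
    "$@"
    { status=$?; set +x; } 2>/dev/null      # stop tracing without echoing this line
    return "$status"
}

# In the script this might wrap the top invocation; echo stands in for it here.
measure echo "sample"
```

The trace goes to stderr, so redirecting stdout to a log file still leaves the echoed command line visible on the console.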

>
> 48 + top -b -d $vmstat_wait -n $vmstat_repeat >> $top_log
> 49 + cat $top_log | grep '.Cpu.*' | tail -n $vmstat_tail >
> $vmstat_log.reduced
> If we are no longer using vmstat, we may as well change the variable names to
> reflect that as well.
>
> Printing the entire top log of every run is not always desirable, and
> sometimes gets in the way. Considering our primary use, I think it's sensible
> to have it on by default, but we should at least have an option to turn it
> off.

I see that some might want it, but I don't think that use case is really relevant to maintain right now, unless you have a real case where we don't want it in automation.

The

Alexander Sack (asac) wrote : Posted in a previous version of this proposal

Pushed an updated version that addresses all comments except the request for conveniently disabling the top log dump at the end.

Paul Larson (pwlars) wrote :

Tested the new version and it fixes the bug I saw.

review: Approve

Preview Diff

=== modified file 'systemsettle/systemsettle.sh'
--- systemsettle/systemsettle.sh 2013-08-16 04:33:28 +0000
+++ systemsettle/systemsettle.sh 2013-08-19 15:01:53 +0000
@@ -9,19 +9,17 @@
 
 cleanup () {
     if ! test "$dump_error" = 0; then
-        echo "System failed to settle to target idle level ($idle_avg_min)"
-        echo " + check out the following top log taken at each retry:"
+        echo "Check out the following top log taken at each retry:"
 
+        echo
         # dumb toplog indented
         while read line; do
             echo " $line"
         done < $top_log
-
-        echo
         # dont rerun this logic in case we get multiple signals
         dump_error=0
     fi
-    rm -f $top_log $vmstat_log $vmstat_log.reduced
+    rm -f $top_log $top_log.reduced
 }
 
 function show_usage() {
@@ -30,10 +28,10 @@
     echo "Options:"
     echo " -r run forever without exiting"
     echo " -p minimum idle percent to wait for (Default: 99)"
-    echo " -c number of times to run vmstat at each iteration (Default: 10)"
-    echo " -d seconds to delay between each vmstat iteration (Default: 6)"
-    echo " -i vmstat measurements to ignore from each loop (Default: 1)"
-    echo " -m maximum loops of vmstat before giving up if minimum idle"
+    echo " -c number of times to run top at each iteration (Default: 10)"
+    echo " -d seconds to delay between each top iteration (Default: 6)"
+    echo " -i top measurements to ignore from each loop (Default: 1)"
+    echo " -m maximum loops of top before giving up if minimum idle"
     echo " percent is not reached (Default: 1)"
     exit 129
 }
@@ -46,11 +44,11 @@
         ;;
     p) idle_avg_min=$OPTARG
         ;;
-    c) vmstat_repeat=$OPTARG
+    c) top_repeat=$OPTARG
         ;;
-    d) vmstat_wait=$OPTARG
+    d) top_wait=$OPTARG
         ;;
-    i) vmstat_ignore=$OPTARG
+    i) top_ignore=$OPTARG
         ;;
     m) settle_max=$OPTARG
         ;;
@@ -59,54 +57,56 @@
 
 # minimum average idle level required to succeed
 idle_avg_min=${idle_avg_min:-99}
-# measurement details: vmstat $vmstat_wait $vmstat_repeat
-vmstat_repeat=${vmstat_repeat:-10}
-vmstat_wait=${vmstat_wait:-6}
+# measurement details: top $top_wait $top_repeat
+top_repeat=${top_repeat:-10}
+top_wait=${top_wait:-6}
 # how many samples to ignore
-vmstat_ignore=${vmstat_ignore:-1}
+top_ignore=${top_ignore:-1}
 # how many total attempts to settle the system
 settle_max=${settle_max:-10}
 
 # set and calc more runtime values
-vmstat_tail=`calc $vmstat_repeat - $vmstat_ignore`
+top_tail=`calc $top_repeat - $top_ignore`
 settle_count=0
 idle_avg=0
 
 echo "System Settle run - quiesce the system"
 echo "--------------------------------------"
 echo
-echo " + cmd: \'vmstat $vmstat_wait $vmstat_repeat\' ignoring first $vmstat_ignore (tail: $vmstat_tail)"
+echo " idle_avg_min = '$idle_avg_min'"
+echo " top_repeat = '$top_repeat'"
+echo " top_wait = '$top_wait'"
+echo " top_ignore = '$top_ignore'"
+echo " settle_max = '$settle_max'"
+echo " run_forever = '$settle_prefix' (- = yes)"
 echo
 
 trap cleanup EXIT INT QUIT ILL KILL SEGV TERM
-vmstat_log=`mktemp -t`
 top_log=`mktemp -t`
 
 while test `calc $idle_avg '<' $idle_avg_min` = 1 -a "$settle_prefix$settle_count" -lt "$settle_max"; do
-    echo Starting settle run $settle_count:
+    echo -n "Starting system idle measurement (run: $settle_count) ... "
 
-    # get vmstat
-    vmstat $vmstat_wait $vmstat_repeat | tee $vmstat_log
-    cat $vmstat_log | tail -n $vmstat_tail > $vmstat_log.reduced
-
-    # log top output for potential debugging
+    # get top
     echo "TOP DUMP (after settle run: $settle_count)" >> $top_log
     echo "========================" >> $top_log
-    top -n 1 -b >> $top_log
+    top -b -d $top_wait -n $top_repeat >> $top_log
+    cat $top_log | grep '.Cpu.*' | tail -n $top_tail > $top_log.reduced
     echo >> $top_log
 
     # calc average of idle field for this measurement
     sum=0
     count=0
     while read line; do
-        idle=`echo $line | sed -e 's/\s\s*/ /g' | cut -d ' ' -f 15`
+        idle=`echo $line | sed -e 's/.* \([0-9\.]*\) id.*/\1/'`
         sum=`calc $sum + $idle`
         count=`calc $count + 1`
-    done < $vmstat_log.reduced
+    done < $top_log.reduced
 
-    idle_avg=`calc $sum.0 / $count.0`
+    idle_avg=`calc $sum / $count`
     settle_count=`calc $settle_count + 1`
 
+    echo " DONE."
     echo
     echo "Measurement:"
     echo " + idle level: $idle_avg"
@@ -119,7 +119,6 @@
     exit 1
 else
     echo "system settled. SUCCESS"
-    dump_error=0
     exit 0
 fi
 
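
The measurement loop in the diff skips the first samples and averages the rest using tail and repeated calc calls. As a hedged alternative sketch, the same idea can be expressed in a single awk pass. The helper name settle_average is hypothetical and awk is swapped in for the script's calc helper, so this is not the merged implementation.

```shell
#!/bin/sh
# Sketch only: settle_average is a hypothetical helper, not part of systemsettle.sh.
# Reads idle values (one per line) on stdin, skips the first $1 samples,
# and prints the average of the remaining ones with two decimals.
settle_average() {
    awk -v skip="$1" '
        NR > skip { sum += $1; n++ }
        END { if (n > 0) printf "%.2f\n", sum / n; else exit 1 }
    '
}

# Three canned samples; the first is ignored, mirroring the default of ignoring 1.
printf '%s\n' 50.0 98.0 100.0 | settle_average 1
```

With the canned samples this prints 99.00. The function exits non-zero when no samples remain after skipping, which avoids a division by zero.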
