Merge lp:~lucio.torre/txstatsd/make-ewma-sampling-faster into lp:txstatsd

Proposed by Lucio Torre
Status: Rejected
Rejected by: Lucio Torre
Proposed branch: lp:~lucio.torre/txstatsd/make-ewma-sampling-faster
Merge into: lp:txstatsd
Diff against target: 125 lines (+36/-10)
2 files modified
txstatsd/stats/exponentiallydecayingsample.py (+12/-8)
txstatsd/tests/stats/test_exponentiallydecayingsample.py (+24/-2)
To merge this branch: bzr merge lp:~lucio.torre/txstatsd/make-ewma-sampling-faster
Reviewer: txStatsD Developers (Status: Pending)
Review via email: mp+90778@code.launchpad.net

Description of the change

Make EWMA sampling take 23% of the time it currently takes.
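
Revision 63 ("moved to insort") and the `import bisect` line in the diff suggest the speedup comes from keeping priorities in sorted order rather than re-sorting the dict keys on every update. The following is a minimal sketch of that idea only, not the branch's actual code: `SortedReservoir` and its methods are hypothetical names, and the values are assumed to be comparable (for example floats).

```python
import bisect


class SortedReservoir:
    """Hypothetical fixed-size reservoir of (priority, value) pairs.

    The list is kept sorted by priority, so the lowest-priority entry is
    always at index 0 and no full sort is needed per update.
    """

    def __init__(self, size):
        self.size = size
        self._items = []  # sorted ascending by priority

    def add(self, priority, value):
        if len(self._items) < self.size:
            # Still filling up: insert in sorted position.
            bisect.insort(self._items, (priority, value))
        elif priority > self._items[0][0]:
            # Full: evict the lowest priority, then insert the new pair.
            del self._items[0]
            bisect.insort(self._items, (priority, value))

    def values(self):
        return [value for _, value in self._items]
```

With a reservoir of 100 entries, each update then costs one binary search plus a list shift instead of sorting 100 keys.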

64. By Lucio Torre

fix rescaling

Lucio Torre (lucio.torre) wrote :

It now also lowers the rescale threshold so that the time difference used in the weight calculation is never larger than 600, because math.exp overflows for large arguments:

In [21]: math.exp(717)
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)

/home/lucio/canonical/txstatsd/fix-memory-issues/<ipython console> in <module>()

OverflowError: math range error
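
The overflow above follows from the range of a double: `math.exp` fails once its argument exceeds roughly 709.78. A small sketch, assuming the weight is `exp(alpha * t)` as in `weight()` in the diff; `max_safe_interval` is a hypothetical helper and not part of txstatsd.

```python
import math
import sys

# Natural log of the largest finite double; exp() of anything
# noticeably above this raises OverflowError.
LIMIT = math.log(sys.float_info.max)  # roughly 709.78


def weight(alpha, t):
    """Forward-decay weight for an item t seconds after the start time."""
    return math.exp(alpha * t)


def max_safe_interval(alpha):
    """Longest t (in seconds) for which alpha * t stays below LIMIT."""
    return LIMIT / alpha


# For any alpha <= 1, an interval of 600 seconds stays well inside the range.
print(weight(1.0, 600))
# With alpha = 0.99 (the value used in the new test) the limit is just
# under 717 seconds.
print(max_safe_interval(0.99))
```

Under these assumptions a 10 minute threshold leaves headroom for decay factors up to about 1.18, whereas the previous 1 hour threshold would overflow for any alpha above roughly 0.197.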

65. By Lucio Torre

skip long test

66. By Lucio Torre

make comment consistent

Unmerged revisions

66. By Lucio Torre

make comment consistent

65. By Lucio Torre

skip long test

64. By Lucio Torre

fix rescaling

63. By Lucio Torre

moved to insort

62. By Sidnei da Silva

[r=lucio.torre] Split prefix from instance name, make ReportingService use instance name as prefix for reporting own metrics

61. By Sidnei da Silva

- Fix some docstrings

60. By Sidnei da Silva

- Delay reactor import so a custom one can be installed.

59. By Lucio Torre

[r=lucio.torre,sidnei] add routing

58. By Sidnei da Silva

[r=caravone,lucio.torre] Replace GraphiteClientProtocol with CarbonClientManager

57. By Sidnei da Silva

- Fix flushing paused message counter [trivial]

Preview Diff

=== modified file 'txstatsd/stats/exponentiallydecayingsample.py'
--- txstatsd/stats/exponentiallydecayingsample.py 2011-09-14 12:01:10 +0000
+++ txstatsd/stats/exponentiallydecayingsample.py 2012-01-30 20:47:26 +0000
@@ -1,4 +1,4 @@
-
+import bisect
 import math
 import random
 import time
@@ -19,10 +19,10 @@
       library/publications/CormodeShkapenyukSrivastavaXu09.pdf>}
     """
 
-    # 1 hour (in seconds)
-    RESCALE_THRESHOLD = 60 * 60
+    # 10 minutes (in seconds)
+    RESCALE_THRESHOLD = 60 * 10
 
-    def __init__(self, reservoir_size, alpha):
+    def __init__(self, reservoir_size, alpha, wall_time=None):
         """Creates a new C{ExponentiallyDecayingSample}.
 
         @param reservoir_size: The number of samples to keep in the sampling
@@ -30,6 +30,7 @@
         @parama alpha: The exponential decay factor; the higher this is,
             the more biased the sample will be towards newer values.
         """
+        #self._values = []
         self._values = dict()
         self.alpha = alpha
         self.reservoir_size = reservoir_size
@@ -38,6 +39,9 @@
         self.start_time = 0
         self.next_scale_time = 0
 
+        if wall_time is None:
+            wall_time = time.time
+        self._wall_time = wall_time
         self.clear()
 
     def clear(self):
@@ -45,7 +49,7 @@
         self.count = 0
         self.start_time = self.tick()
         self.next_scale_time = (
-            time.time() + ExponentiallyDecayingSample.RESCALE_THRESHOLD)
+            self._wall_time() + self.RESCALE_THRESHOLD)
 
     def size(self):
         return min(self.reservoir_size, self.count)
@@ -74,7 +78,7 @@
                     self._values[priority] = value
                     del self._values[first]
 
-        now = time.time()
+        now = self._wall_time()
         next = self.next_scale_time
         if now >= next:
             self.rescale(now, next)
@@ -84,7 +88,7 @@
         return [self._values[k] for k in keys]
 
     def tick(self):
-        return time.time()
+        return self._wall_time()
 
     def weight(self, t):
         return math.exp(self.alpha * t)
@@ -113,7 +117,7 @@
         """
 
         self.next_scale_time = (
-            now + ExponentiallyDecayingSample.RESCALE_THRESHOLD)
+            now + self.RESCALE_THRESHOLD)
         old_start_time = self.start_time
         self.start_time = self.tick()
         keys = sorted(self._values.keys())

=== modified file 'txstatsd/tests/stats/test_exponentiallydecayingsample.py'
--- txstatsd/tests/stats/test_exponentiallydecayingsample.py 2011-09-14 12:01:10 +0000
+++ txstatsd/tests/stats/test_exponentiallydecayingsample.py 2012-01-30 20:47:26 +0000
@@ -1,3 +1,4 @@
+import random
 
 from unittest import TestCase
 
@@ -26,7 +27,7 @@
         for i in population:
             sample.update(i)
 
-        self.assertEqual(sample.size(), 10, 'Should have 10 elements')
+        self.assertEqual(sample.size(), 10)
         self.assertEqual(len(sample.get_values()), 10,
                          'Should have 10 elements')
         self.assertEqual(
@@ -42,6 +43,27 @@
         self.assertEqual(sample.size(), 100, 'Should have 100 elements')
         self.assertEqual(len(sample.get_values()), 100,
                          'Should have 100 elements')
+
         self.assertEqual(
             len(set(sample.get_values()).difference(set(population))), 0,
-            'Should only have elements from the population')
\ No newline at end of file
+            'Should only have elements from the population')
+
+    def test_ewma_sample_load(self):
+
+        _time = [10000]
+
+        def wtime():
+            return _time[0]
+
+        sample = ExponentiallyDecayingSample(100, 0.99, wall_time=wtime)
+        sample.RESCALE_THRESHOLD = 100
+        sample.clear()
+        for i in xrange(10000000):
+            sample.update(random.normalvariate(0, 10))
+            _time[0] += 1
+
+        self.assertEqual(sample.size(), 100)
+        self.assertEqual(len(sample.get_values()), 100,
+                         'Should have 100 elements')
+    test_ewma_sample_load.skip = "takes too long to run"
+
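
The `wall_time` parameter added in the diff lets a test replace `time.time` with a clock it controls, and reading the threshold through `self` lets a test override it per instance. A minimal sketch of that pattern, using a hypothetical `Ticker` class as a stand-in for `ExponentiallyDecayingSample`; `FakeClock` is likewise an illustrative name, not something from txstatsd.

```python
import time


class Ticker:
    """Hypothetical stand-in: fires a periodic action based on an injected clock."""

    INTERVAL = 100  # seconds; read via self so an instance can override it

    def __init__(self, wall_time=None):
        # Default to the real clock when no replacement is supplied.
        if wall_time is None:
            wall_time = time.time
        self._wall_time = wall_time
        self.fired = 0
        self.next_time = self._wall_time() + self.INTERVAL

    def poll(self):
        now = self._wall_time()
        if now >= self.next_time:
            self.fired += 1
            self.next_time = now + self.INTERVAL


class FakeClock:
    """Callable clock whose time only moves when the test advances it."""

    def __init__(self, start):
        self.now = start

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock(10000)
ticker = Ticker(wall_time=clock)
for _ in range(250):
    ticker.poll()
    clock.advance(1)
print(ticker.fired)
```

Because the clock never touches the system time, 250 simulated seconds pass instantly and the result is deterministic.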
