Merge lp:~ziggy/graphite/whitelist into lp:~graphite-dev/graphite/main

Proposed by Eric Ziegenhorn
Status: Superseded
Proposed branch: lp:~ziggy/graphite/whitelist
Merge into: lp:~graphite-dev/graphite/main
Diff against target: 405 lines (+304/-0) (has conflicts)
5 files modified
carbon/bin/carbon-cache.py (+162/-0)
carbon/conf/carbon.conf.example (+12/-0)
carbon/conf/whitelist (+1/-0)
carbon/lib/carbon/protocols.py (+59/-0)
carbon/lib/carbon/writer.py (+70/-0)
Text conflict in carbon/bin/carbon-cache.py
Text conflict in carbon/conf/carbon.conf.example
Text conflict in carbon/lib/carbon/protocols.py
To merge this branch: bzr merge lp:~ziggy/graphite/whitelist
Reviewer Review Type Date Requested Status
chrismd Needs Fixing
Review via email: mp+45710@code.launchpad.net

This proposal has been superseded by a proposal from 2011-12-14.

Description of the change

We would be completely overwhelmed by the volume of metrics flowing into Graphite if we accepted everything that came our way. I can't control what is sent to Graphite, but I can control what is accepted into Graphite. To implement that, we've created whitelist and blacklist functionality that respectively allows or denies metrics entry into Graphite. This code is based on code that has been running in production for months with several thousand metrics per minute after filtering.

Each list is a file with one regular expression per line. If a metric matches a regex in the blacklist it is dropped. Only if a metric matches a regex in the whitelist will it be accepted by carbon.

For example, a whitelist consisting of the line:
^server\.load\.
will allow any metric that starts with "server.load." into carbon, unless it also matches a regex in the blacklist.

To test, uncomment the WHITELIST and/or BLACKLIST lines in carbon.conf and add one or more regular expressions to graphite/conf/whitelist and/or graphite/conf/blacklist. Restart carbon for your changes to take effect. Metrics which are rejected are logged in listener.log.

To post a comment you must log in.
Revision history for this message
Jon Stevens (latchkey-gmail) wrote :

I personally think that this should be done through a separate proxy sitting in front of carbon-cache.py. It seems like by the time carbon accepts the request and parses it, you might as well just record it.

Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

But that takes storage space, the time to create it, etc. I agree that
would be a good architecture though if a single proxy could handle the flow
in real time, and I'd guess it could in most cases. Note that filtering
happens right when the request hits carbon though, not after much
processing. It never gets to the metric queue if it isn't whitelisted.

On Mon, Jan 10, 2011 at 12:13 PM, Jon Stevens <email address hidden> wrote:

> I personally think that this should be done through a separate proxy
> sitting in front of carbon-cache.py. It seems like by the time carbon
> accepts the request and parses it, you might as well just record it.
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote :

Sorry for the delayed response. Actually carbon-cache already has a whitelist feature, it broke a while back because a default value for its config option went away somehow but I just fixed that in trunk. Anyways, it uses an exact list of metric names for whitelisting and doesn't provide blacklisting so your patch definitely adds value.

However you use a module I've never heard of, regexlist. I would prefer to not add another thirdparty dependency if possible and in this case I think it would be pretty simple to write a quick function to read the regexes from a file into a list. Also the diffs against carbon-cache.py no longer apply since Sidnei's refactoring to use twistd will be merged soon.

Thanks for submitting this, I'm sure many will find it useful once we get it merged in

review: Needs Fixing
Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

I wrote regexlist module specifically for this feature. I could in-line it
somewhere if that is preferrable. I can update it to apply to trunk in the
not-too-distant future if there is nothing else to re-work first, just let
me know. Thanks!

On Sun, Jul 10, 2011 at 5:22 PM, chrismd <email address hidden> wrote:

> Review: Needs Fixing
> Sorry for the delayed response. Actually carbon-cache already has a
> whitelist feature, it broke a while back because a default value for its
> config option went away somehow but I just fixed that in trunk. Anyways, it
> uses an exact list of metric names for whitelisting and doesn't provide
> blacklisting so your patch definitely adds value.
>
> However you use a module I've never heard of, regexlist. I would prefer to
> not add another thirdparty dependency if possible and in this case I think
> it would be pretty simple to write a quick function to read the regexes from
> a file into a list. Also the diffs against carbon-cache.py no longer apply
> since Sidnei's refactoring to use twistd will be merged soon.
>
> Thanks for submitting this, I'm sure many will find it useful once we get
> it merged in
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote :

Sure just inline the function. The only thing I'd change is the if 'BLACKLIST' in settings check being done for each line received, ideally that lookup only need to be done once outside the lineReceived function.

> I wrote regexlist module specifically for this feature. I could in-line it
> somewhere if that is preferrable. I can update it to apply to trunk in the
> not-too-distant future if there is nothing else to re-work first, just let
> me know. Thanks!
>

lp:~ziggy/graphite/whitelist updated
337. By Eric Ziegenhorn

Added config options for logging whitelist/blacklist misses/hits.
Moved RegexList class in to writer.py

338. By Eric Ziegenhorn

Added a default whitelist entry to allow everything in

339. By Eric Ziegenhorn

Add support for comments in whitelist/blacklist files

340. By Eric Ziegenhorn

Removed explicit directory chmod as that is not supposed to be part of this tree

Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

I in-lined the RegexList class in writer.py as per your request. I changed the checks in writeCachedDataPoints() to try and make them faster especially in the case you're not using a whitelist or blacklist, but I don't see a way to get rid of them entirely. Feel free to enlighten me. :) I added more config settings for logging related to the whitelist or blacklist because it can be very useful to know where your metrics are going but sometimes it is too slow. I can tweak/get rid of that as you'd prefer. I can write proper documentation for this for the wiki if and when you want. I have this refactoring running in dev, but will push out to staging and production for further testing if I'm heading in the right direction for getting it upstream.

Additionally, I made a small standalone web interface for checking a metric to see if it would be whitelisted or blacklisted and if so, which regexes it matches. This has been helpful because when you get lots of whitelist or blacklist entries it can be hard to tell if a given metric matches or not. It's ~130 lines of python (+~2700 lines for the bottle.py micro-framework). Let me know if that'd be useful to include or reference with Graphite.

Revision history for this message
chrismd (chrismd) wrote :

Hi Ziggy, thanks for the updates. Unfortunately this patch doesn't apply cleanly to trunk because of all the recent carbon refactoring (ie. carbon-cache.py on trunk is 100% different than the source this branch derives from). It probably won't take much to sync it with current trunk but that should definitely happen as a trunk merge into this branch rather than the other way around, at least at first. Can you 'bzr merge lp:graphite' into your branch, resolve the conflicts and verify everything still works? If we can get this branch into a state where it cleanly merges into trunk then we'll be good to go. Thanks.

Unmerged revisions

340. By Eric Ziegenhorn

Removed explicit directory chmod as that is not supposed to be part of this tree

339. By Eric Ziegenhorn

Add support for comments in whitelist/blacklist files

338. By Eric Ziegenhorn

Added a default whitelist entry to allow everything in

337. By Eric Ziegenhorn

Added config options for logging whitelist/blacklist misses/hits.
Moved RegexList class in to writer.py

336. By Eric Ziegenhorn

Work with WHITELIST and/or BLACKLIST commented out in carbon.conf

335. By Eric Ziegenhorn

Improved comments in example carbon.conf

334. By Eric Ziegenhorn

Added blank whitelist and blacklist files

333. By Eric Ziegenhorn

Added whitelist/blacklist functionality

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'carbon/bin/carbon-cache.py'
2--- carbon/bin/carbon-cache.py 2011-07-11 19:24:57 +0000
3+++ carbon/bin/carbon-cache.py 2011-08-26 21:17:16 +0000
4@@ -25,6 +25,168 @@
5 LIB_DIR = join(ROOT_DIR, 'lib')
6 sys.path.insert(0, LIB_DIR)
7
8+<<<<<<< TREE
9 from carbon.util import run_twistd_plugin
10
11 run_twistd_plugin(__file__)
12+=======
13+# Capture useful debug info for this commonly reported problem
14+try:
15+ import carbon
16+except ImportError:
17+ print 'Failed to import carbon, debug information follows.'
18+ print 'pwd=%s' % os.getcwd()
19+ print 'sys.path=%s' % sys.path
20+ print '__file__=%s' % __file__
21+ sys.exit(1)
22+
23+
24+# Parse command line options
25+parser = optparse.OptionParser(usage='%prog [options] <start|stop|status>')
26+parser.add_option('--debug', action='store_true', help='Run in the foreground, log to stdout')
27+parser.add_option('--profile', help='Record performance profile data to the given file')
28+parser.add_option('--pidfile', default=join(STORAGE_DIR, '%s.pid' % program.split('.')[0]), help='Write pid to the given file')
29+parser.add_option('--config', default=join(CONF_DIR, 'carbon.conf'), help='Use the given config file')
30+parser.add_option('--logdir', default=LOG_DIR, help="Write logs in the given directory")
31+
32+(options, args) = parser.parse_args()
33+
34+if not args:
35+ parser.print_usage()
36+ raise SystemExit(1)
37+
38+action = args[0]
39+
40+if action == 'stop':
41+ if not exists(options.pidfile):
42+ print 'Pidfile %s does not exist' % options.pidfile
43+ raise SystemExit(0)
44+
45+ pidfile = open(options.pidfile, 'r')
46+ try:
47+ pid = int( pidfile.read().strip() )
48+ except:
49+ print 'Could not read pidfile %s' % options.pidfile
50+ raise SystemExit(1)
51+
52+ print 'Deleting %s (contained pid %d)' % (options.pidfile, pid)
53+ os.unlink(options.pidfile)
54+
55+ print 'Sending kill signal to pid %d' % pid
56+ os.kill(pid, 15)
57+ raise SystemExit(0)
58+
59+
60+elif action == 'status':
61+ if not exists(options.pidfile):
62+ print '%s is not running' % program
63+ raise SystemExit(0)
64+
65+ pidfile = open(options.pidfile, 'r')
66+ try:
67+ pid = int( pidfile.read().strip() )
68+ except:
69+ print 'Failed to read pid from %s' % options.pidfile
70+ raise SystemExit(1)
71+
72+ if exists('/proc/%d' % pid):
73+ print "%s is running with pid %d" % (program, pid)
74+ raise SystemExit(0)
75+ else:
76+ print "%s is not running" % program
77+ raise SystemExit(0)
78+
79+elif action != 'start':
80+ parser.print_usage()
81+ raise SystemExit(1)
82+
83+
84+if exists(options.pidfile):
85+ print "Pidfile %s already exists, is %s already running?" % (options.pidfile, program)
86+ raise SystemExit(1)
87+
88+# Import application components
89+from carbon.conf import settings
90+# Read config (we want failures to occur before daemonizing) and settings is needed in some imported modules below
91+settings.readFrom(options.config, 'cache')
92+from carbon.log import logToStdout, logToDir
93+from carbon.listeners import MetricLineReceiver, MetricPickleReceiver, CacheQueryHandler, startListener
94+from carbon.cache import MetricCache
95+from carbon.writer import startWriter
96+from carbon.instrumentation import startRecordingCacheMetrics
97+from carbon.events import metricReceived
98+
99+use_amqp = settings.get("ENABLE_AMQP", False)
100+if use_amqp:
101+ from carbon import amqp_listener
102+ amqp_host = settings.get("AMQP_HOST", "localhost")
103+ amqp_port = settings.get("AMQP_PORT", 5672)
104+ amqp_user = settings.get("AMQP_USER", "guest")
105+ amqp_password = settings.get("AMQP_PASSWORD", "guest")
106+ amqp_verbose = settings.get("AMQP_VERBOSE", False)
107+ amqp_vhost = settings.get("AMQP_VHOST", "/")
108+ amqp_spec = settings.get("AMQP_SPEC", None)
109+ amqp_exchange_name = settings.get("AMQP_EXCHANGE", "graphite")
110+
111+
112+# --debug
113+if options.debug:
114+ logToStdout()
115+
116+else:
117+ if not isdir(options.logdir):
118+ os.makedirs(options.logdir)
119+
120+ if settings.USER:
121+ print "Dropping privileges to become the user %s" % settings.USER
122+
123+ from carbon.util import daemonize, dropprivs
124+ daemonize()
125+
126+ pidfile = open(options.pidfile, 'w')
127+ pidfile.write( str(os.getpid()) )
128+ pidfile.close()
129+
130+ def shutdown():
131+ if os.path.exists(options.pidfile):
132+ os.unlink(options.pidfile)
133+
134+ atexit.register(shutdown)
135+
136+ if settings.USER:
137+ pwent = pwd.getpwnam(settings.USER)
138+ os.chown(options.pidfile, pwent.pw_uid, pwent.pw_gid)
139+ dropprivs(settings.USER)
140+
141+ logToDir(options.logdir)
142+
143+# Configure application components
144+metricReceived.installHandler(MetricCache.store)
145+startListener(settings.LINE_RECEIVER_INTERFACE, settings.LINE_RECEIVER_PORT, MetricLineReceiver)
146+startListener(settings.PICKLE_RECEIVER_INTERFACE, settings.PICKLE_RECEIVER_PORT, MetricPickleReceiver)
147+startListener(settings.CACHE_QUERY_INTERFACE, settings.CACHE_QUERY_PORT, CacheQueryHandler)
148+
149+if use_amqp:
150+ amqp_listener.startReceiver(amqp_host, amqp_port, amqp_user, amqp_password,
151+ vhost=amqp_vhost, spec=amqp_spec,
152+ exchange_name=amqp_exchange_name,
153+ verbose=amqp_verbose)
154+
155+startWriter()
156+startRecordingCacheMetrics()
157+
158+
159+# Run the twisted reactor
160+print "%s running" % program
161+
162+if options.profile:
163+ import cProfile
164+
165+ if exists(options.profile):
166+ os.unlink(options.profile)
167+
168+ cProfile.run('reactor.run()', options.profile)
169+
170+else:
171+ reactor.run()
172+>>>>>>> MERGE-SOURCE
173
174=== added file 'carbon/conf/blacklist'
175=== modified file 'carbon/conf/carbon.conf.example'
176--- carbon/conf/carbon.conf.example 2011-08-15 07:09:56 +0000
177+++ carbon/conf/carbon.conf.example 2011-08-26 21:17:16 +0000
178@@ -82,10 +82,22 @@
179 # degrade performance if logging on the same volume as the whisper data is stored.
180 LOG_UPDATES = False
181
182+<<<<<<< TREE
183 # On some systems it is desirable for whisper to write synchronously.
184 # Set this option to True if you'd like to try this. Basically it will
185 # shift the onus of buffering writes from the kernel into carbon's cache.
186 WHISPER_AUTOFLUSH = False
187+=======
188+# Adding regular expressions to the whitelist makes carbon accept only matching metrics
189+#WHITELIST = /opt/graphite/conf/whitelist
190+# Logging metrics that don't match the whitelist can be useful but can degrade performance
191+#LOG_WHITELIST = False
192+
193+# Adding regular expressions to the blacklist prevents carbon from accepting matching metrics
194+#BLACKLIST = /opt/graphite/conf/blacklist
195+# Logging metrics that match the blacklist can be useful but can degrade performance
196+#LOG_BLACKLIST = False
197+>>>>>>> MERGE-SOURCE
198
199 # Enable AMQP if you want to receve metrics using an amqp broker
200 # ENABLE_AMQP = False
201
202=== added file 'carbon/conf/whitelist'
203--- carbon/conf/whitelist 1970-01-01 00:00:00 +0000
204+++ carbon/conf/whitelist 2011-08-26 21:17:16 +0000
205@@ -0,0 +1,1 @@
206+.*
207
208=== modified file 'carbon/lib/carbon/protocols.py'
209--- carbon/lib/carbon/protocols.py 2011-08-18 08:48:19 +0000
210+++ carbon/lib/carbon/protocols.py 2011-08-26 21:17:16 +0000
211@@ -2,13 +2,28 @@
212 from twisted.internet.protocol import DatagramProtocol
213 from twisted.internet.error import ConnectionDone
214 from twisted.protocols.basic import LineOnlyReceiver, Int32StringReceiver
215+<<<<<<< TREE
216 from carbon import log, events, state, management
217+=======
218+from carbon.cache import MetricCache
219+from carbon.relay import relay
220+from carbon.instrumentation import increment
221+from carbon.events import metricReceived
222+from carbon import log
223+from carbon.conf import settings
224+import regexlist
225+>>>>>>> MERGE-SOURCE
226
227 try:
228 import cPickle as pickle
229 except ImportError:
230 import pickle
231
232+# Load whitelist and blacklist
233+if 'WHITELIST' in settings:
234+ whitelist = regexlist.RegexList(settings.WHITELIST)
235+if 'BLACKLIST' in settings:
236+ blacklist = regexlist.RegexList(settings.BLACKLIST)
237
238 class MetricReceiver:
239 """ Base class for all metric receiving protocols, handles flow
240@@ -60,6 +75,7 @@
241 log.listener('invalid line received from client %s, ignoring' % self.peerName)
242 return
243
244+<<<<<<< TREE
245 events.metricReceived(metric, datapoint)
246
247
248@@ -76,6 +92,21 @@
249
250
251 class MetricPickleReceiver(MetricReceiver, Int32StringReceiver):
252+=======
253+ if 'BLACKLIST' in settings and blacklist.in_list(metric):
254+ log.listener('ignoring metric %s which is blacklisted' % metric)
255+ return
256+
257+ if 'WHITELIST' in settings and not whitelist.in_list(metric):
258+ log.listener('ignoring metric %s which is not whitelisted' % metric)
259+ return
260+
261+ increment('metricsReceived')
262+ metricReceived(metric, datapoint)
263+
264+
265+class MetricPickleReceiver(LoggingMixin, Int32StringReceiver):
266+>>>>>>> MERGE-SOURCE
267 MAX_LENGTH = 2 ** 20
268
269 def stringReceived(self, data):
270@@ -89,6 +120,7 @@
271 try:
272 datapoint = ( float(datapoint[0]), float(datapoint[1]) ) #force proper types
273 except:
274+<<<<<<< TREE
275 continue
276
277 if datapoint[1] == datapoint[1]: # filter out NaN values
278@@ -126,6 +158,33 @@
279 result = dict(error="Invalid request type \"%s\"" % request['type'])
280
281 response = pickle.dumps(result, protocol=-1)
282+=======
283+ log.listener('invalid pickle received from client %s, ignoring' % self.peerAddr)
284+ continue
285+
286+ if 'BLACKLIST' in settings and blacklist.in_list(metric):
287+ log.listener('ignoring metric %s which is blacklisted' % metric)
288+ continue
289+
290+ if 'WHITELIST' in settings and not whitelist.in_list(metric):
291+ log.listener('ignoring metric %s which is not whitelisted' % metric)
292+ continue
293+
294+ if datapoint[1] != datapoint[1]: # filter out NaN values
295+ log.listener('ingoring metric with NaN value, %s' % metric)
296+ continue
297+
298+ metricReceived(metric, datapoint)
299+
300+ increment('metricsReceived', len(datapoints))
301+
302+
303+class CacheQueryHandler(LoggingMixin, Int32StringReceiver):
304+ def stringReceived(self, metric):
305+ values = MetricCache.get(metric, [])
306+ log.query('cache query for %s returned %d values' % (metric, len(values)))
307+ response = pickle.dumps(values, protocol=-1)
308+>>>>>>> MERGE-SOURCE
309 self.sendString(response)
310
311
312
313=== modified file 'carbon/lib/carbon/writer.py'
314--- carbon/lib/carbon/writer.py 2011-08-15 21:16:40 +0000
315+++ carbon/lib/carbon/writer.py 2011-08-26 21:17:16 +0000
316@@ -15,6 +15,7 @@
317
318 import os
319 import time
320+import re
321 from os.path import join, exists, dirname, basename
322 from carbon import state
323
324@@ -41,6 +42,65 @@
325 CACHE_SIZE_LOW_WATERMARK = settings.MAX_CACHE_SIZE * 0.95
326
327
328+class RegexList:
329+ """ List of regexs """
330+
331+ def __init__(self, listFile):
332+ self.regexs = []
333+ self.listFile = listFile
334+ self.mtime = None
335+ self.lastUpdate = time.time()
336+ self.update()
337+
338+ def update(self):
339+ """ If the list file has changed, reload it """
340+ self.lastUpdate = time.time()
341+ if not os.path.exists(self.listFile):
342+ self.regexs = []
343+ self.mtime = None
344+ return
345+ mtime = os.stat(self.listFile).st_mtime
346+ if mtime > self.mtime:
347+ self.mtime = mtime
348+ """ Load the list file """
349+ self.regexs = []
350+ self.mtime = os.stat(self.listFile).st_mtime
351+ file = open(self.listFile)
352+ patterns = file.read().split('\n')
353+ file.close()
354+ for pattern in patterns:
355+ p = pattern.strip()
356+ if p and p[0] != '#':
357+ self.regexs.append(re.compile(p))
358+
359+ def in_list(self, value):
360+ """ Test if regexs is listed """
361+ if time.time() - self.lastUpdate > 5:
362+ self.update()
363+ for regex in self.regexs:
364+ if regex.search(value):
365+ return True
366+ return False
367+
368+ def matching(self, value):
369+ """ Returns the regular expressions that matches a value """
370+ matches = []
371+ for regex in self.regexs:
372+ if regex.search(value):
373+ matches.append(regex.pattern)
374+ return matches
375+
376+# Load whitelist and blacklist
377+if settings.WHITELIST:
378+ whitelist = RegexList(settings.WHITELIST)
379+else:
380+ whitelist = None
381+if settings.BLACKLIST:
382+ blacklist = RegexList(settings.BLACKLIST)
383+else:
384+ blacklist = None
385+
386+
387 def optimalWriteOrder():
388 "Generates metrics with the most cached values first and applies a soft rate limit on new metrics"
389 global lastCreateInterval
390@@ -96,6 +156,16 @@
391 for (metric, datapoints, dbFilePath, dbFileExists) in optimalWriteOrder():
392 dataWritten = True
393
394+ # Skip metrics that are blacklisted or not in the whitelist
395+ if blacklist and blacklist.in_list(metric):
396+ if settings.LOG_BLACKLIST:
397+ log.msg('metric %s is blacklisted' % metric)
398+ continue
399+ if whitelist and not whitelist.in_list(metric):
400+ if settings.LOG_WHITELIST:
401+ log.msg('metric %s is not whitelisted' % metric)
402+ continue
403+
404 if not dbFileExists:
405 archiveConfig = None
406