Merge lp:~ziggy/graphite/whitelist into lp:~graphite-dev/graphite/main

Proposed by Eric Ziegenhorn
Status: Superseded
Proposed branch: lp:~ziggy/graphite/whitelist
Merge into: lp:~graphite-dev/graphite/main
Diff against target: 405 lines (+304/-0) (has conflicts)
5 files modified
carbon/bin/carbon-cache.py (+162/-0)
carbon/conf/carbon.conf.example (+12/-0)
carbon/conf/whitelist (+1/-0)
carbon/lib/carbon/protocols.py (+59/-0)
carbon/lib/carbon/writer.py (+70/-0)
Text conflict in carbon/bin/carbon-cache.py
Text conflict in carbon/conf/carbon.conf.example
Text conflict in carbon/lib/carbon/protocols.py
To merge this branch: bzr merge lp:~ziggy/graphite/whitelist
Reviewer Review Type Date Requested Status
chrismd Needs Fixing
Review via email: mp+45710@code.launchpad.net

This proposal has been superseded by a proposal from 2011-12-14.

Description of the change

We would be completely overwhelmed by the volume of metrics flowing into Graphite if we accepted everything that came our way. I can't control what is sent to Graphite, but I can control what is accepted into Graphite. To implement that, we've created whitelist and blacklist functionality that respectively allows or denies metrics entry into Graphite. This code is based on code that has been running in production for months with several thousand metrics per minute after filtering.

Each list is a file with one regular expression per line. If a metric matches a regex in the blacklist it is dropped. Only if a metric matches a regex in the whitelist will it be accepted by carbon.

For example, a whitelist consisting of the line:
^server\.load\.
will allow any metric that starts with "server.load." into carbon, unless it also matches a regex in the blacklist.

To test, uncomment the WHITELIST and/or BLACKLIST lines in carbon.conf and add one or more regular expressions to graphite/conf/whitelist and/or graphite/conf/blacklist. Restart carbon for your changes to take effect. Metrics which are rejected are logged in listener.log.

To post a comment you must log in.
Revision history for this message
Jon Stevens (latchkey-gmail) wrote :

I personally think that this should be done through a separate proxy sitting in front of carbon-cache.py. It seems like by the time carbon accepts the request and parses it, you might as well just record it.

Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

But that takes storage space, the time to create it, etc. I agree that
would be a good architecture though if a single proxy could handle the flow
in real time, and I'd guess it could in most cases. Note that filtering
happens right when the request hits carbon though, not after much
processing. It never gets to the metric queue if it isn't whitelisted.

On Mon, Jan 10, 2011 at 12:13 PM, Jon Stevens <email address hidden> wrote:

> I personally think that this should be done through a separate proxy
> sitting in front of carbon-cache.py. It seems like by the time carbon
> accepts the request and parses it, you might as well just record it.
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote :

Sorry for the delayed response. Actually carbon-cache already has a whitelist feature, it broke a while back because a default value for its config option went away somehow but I just fixed that in trunk. Anyways, it uses an exact list of metric names for whitelisting and doesn't provide blacklisting so your patch definitely adds value.

However you use a module I've never heard of, regexlist. I would prefer to not add another thirdparty dependency if possible and in this case I think it would be pretty simple to write a quick function to read the regexes from a file into a list. Also the diffs against carbon-cache.py no longer apply since Sidnei's refactoring to use twistd will be merged soon.

Thanks for submitting this, I'm sure many will find it useful once we get it merged in

review: Needs Fixing
Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

I wrote regexlist module specifically for this feature. I could in-line it
somewhere if that is preferrable. I can update it to apply to trunk in the
not-too-distant future if there is nothing else to re-work first, just let
me know. Thanks!

On Sun, Jul 10, 2011 at 5:22 PM, chrismd <email address hidden> wrote:

> Review: Needs Fixing
> Sorry for the delayed response. Actually carbon-cache already has a
> whitelist feature, it broke a while back because a default value for its
> config option went away somehow but I just fixed that in trunk. Anyways, it
> uses an exact list of metric names for whitelisting and doesn't provide
> blacklisting so your patch definitely adds value.
>
> However you use a module I've never heard of, regexlist. I would prefer to
> not add another thirdparty dependency if possible and in this case I think
> it would be pretty simple to write a quick function to read the regexes from
> a file into a list. Also the diffs against carbon-cache.py no longer apply
> since Sidnei's refactoring to use twistd will be merged soon.
>
> Thanks for submitting this, I'm sure many will find it useful once we get
> it merged in
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote :

Sure just inline the function. The only thing I'd change is the if 'BLACKLIST' in settings check being done for each line received, ideally that lookup only need to be done once outside the lineReceived function.

> I wrote regexlist module specifically for this feature. I could in-line it
> somewhere if that is preferrable. I can update it to apply to trunk in the
> not-too-distant future if there is nothing else to re-work first, just let
> me know. Thanks!
>

lp:~ziggy/graphite/whitelist updated
337. By Eric Ziegenhorn

Added config options for logging whitelist/blacklist misses/hits.
Moved RegexList class in to writer.py

338. By Eric Ziegenhorn

Added a default whitelist entry to allow everything in

339. By Eric Ziegenhorn

Add support for comments in whitelist/blacklist files

340. By Eric Ziegenhorn

Removed explicit directory chmod as that is not supposed to be part of this tree

Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

I in-lined the RegexList class in writer.py as per your request. I changed the checks in writeCachedDataPoints() to try and make them faster especially in the case you're not using a whitelist or blacklist, but I don't see a way to get rid of them entirely. Feel free to enlighten me. :) I added more config settings for logging related to the whitelist or blacklist because it can be very useful to know where your metrics are going but sometimes it is too slow. I can tweak/get rid of that as you'd prefer. I can write proper documentation for this for the wiki if and when you want. I have this refactoring running in dev, but will push out to staging and production for further testing if I'm heading in the right direction for getting it upstream.

Additionally, I made a small standalone web interface for checking a metric to see if it would be whitelisted or blacklisted and if so, which regexes it matches. This has been helpful because when you get lots of whitelist or blacklist entries it can be hard to tell if a given metric matches or not. It's ~130 lines of python (+~2700 lines for the bottle.py micro-framework). Let me know if that'd be useful to include or reference with Graphite.

Revision history for this message
chrismd (chrismd) wrote :

Hi Ziggy, thanks for the updates. Unfortunately this patch doesn't apply cleanly to trunk because of all the recent carbon refactoring (ie. carbon-cache.py on trunk is 100% different than the source this branch derives from). It probably won't take much to sync it with current trunk but that should definitely happen as a trunk merge into this branch rather than the other way around, at least at first. Can you 'bzr merge lp:graphite' into your branch, resolve the conflicts and verify everything still works? If we can get this branch into a state where it cleanly merges into trunk then we'll be good to go. Thanks.

Unmerged revisions

340. By Eric Ziegenhorn

Removed explicit directory chmod as that is not supposed to be part of this tree

339. By Eric Ziegenhorn

Add support for comments in whitelist/blacklist files

338. By Eric Ziegenhorn

Added a default whitelist entry to allow everything in

337. By Eric Ziegenhorn

Added config options for logging whitelist/blacklist misses/hits.
Moved RegexList class in to writer.py

336. By Eric Ziegenhorn

Work with WHITELIST and/or BLACKLIST commented out in carbon.conf

335. By Eric Ziegenhorn

Improved comments in example carbon.conf

334. By Eric Ziegenhorn

Added blank whitelist and blacklist files

333. By Eric Ziegenhorn

Added whitelist/blacklist functionality

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'carbon/bin/carbon-cache.py'
--- carbon/bin/carbon-cache.py 2011-07-11 19:24:57 +0000
+++ carbon/bin/carbon-cache.py 2011-08-26 21:17:16 +0000
@@ -25,6 +25,168 @@
25LIB_DIR = join(ROOT_DIR, 'lib')25LIB_DIR = join(ROOT_DIR, 'lib')
26sys.path.insert(0, LIB_DIR)26sys.path.insert(0, LIB_DIR)
2727
28<<<<<<< TREE
28from carbon.util import run_twistd_plugin29from carbon.util import run_twistd_plugin
2930
30run_twistd_plugin(__file__)31run_twistd_plugin(__file__)
32=======
33# Capture useful debug info for this commonly reported problem
34try:
35 import carbon
36except ImportError:
37 print 'Failed to import carbon, debug information follows.'
38 print 'pwd=%s' % os.getcwd()
39 print 'sys.path=%s' % sys.path
40 print '__file__=%s' % __file__
41 sys.exit(1)
42
43
44# Parse command line options
45parser = optparse.OptionParser(usage='%prog [options] <start|stop|status>')
46parser.add_option('--debug', action='store_true', help='Run in the foreground, log to stdout')
47parser.add_option('--profile', help='Record performance profile data to the given file')
48parser.add_option('--pidfile', default=join(STORAGE_DIR, '%s.pid' % program.split('.')[0]), help='Write pid to the given file')
49parser.add_option('--config', default=join(CONF_DIR, 'carbon.conf'), help='Use the given config file')
50parser.add_option('--logdir', default=LOG_DIR, help="Write logs in the given directory")
51
52(options, args) = parser.parse_args()
53
54if not args:
55 parser.print_usage()
56 raise SystemExit(1)
57
58action = args[0]
59
60if action == 'stop':
61 if not exists(options.pidfile):
62 print 'Pidfile %s does not exist' % options.pidfile
63 raise SystemExit(0)
64
65 pidfile = open(options.pidfile, 'r')
66 try:
67 pid = int( pidfile.read().strip() )
68 except:
69 print 'Could not read pidfile %s' % options.pidfile
70 raise SystemExit(1)
71
72 print 'Deleting %s (contained pid %d)' % (options.pidfile, pid)
73 os.unlink(options.pidfile)
74
75 print 'Sending kill signal to pid %d' % pid
76 os.kill(pid, 15)
77 raise SystemExit(0)
78
79
80elif action == 'status':
81 if not exists(options.pidfile):
82 print '%s is not running' % program
83 raise SystemExit(0)
84
85 pidfile = open(options.pidfile, 'r')
86 try:
87 pid = int( pidfile.read().strip() )
88 except:
89 print 'Failed to read pid from %s' % options.pidfile
90 raise SystemExit(1)
91
92 if exists('/proc/%d' % pid):
93 print "%s is running with pid %d" % (program, pid)
94 raise SystemExit(0)
95 else:
96 print "%s is not running" % program
97 raise SystemExit(0)
98
99elif action != 'start':
100 parser.print_usage()
101 raise SystemExit(1)
102
103
104if exists(options.pidfile):
105 print "Pidfile %s already exists, is %s already running?" % (options.pidfile, program)
106 raise SystemExit(1)
107
108# Import application components
109from carbon.conf import settings
110# Read config (we want failures to occur before daemonizing) and settings is needed in some imported modules below
111settings.readFrom(options.config, 'cache')
112from carbon.log import logToStdout, logToDir
113from carbon.listeners import MetricLineReceiver, MetricPickleReceiver, CacheQueryHandler, startListener
114from carbon.cache import MetricCache
115from carbon.writer import startWriter
116from carbon.instrumentation import startRecordingCacheMetrics
117from carbon.events import metricReceived
118
119use_amqp = settings.get("ENABLE_AMQP", False)
120if use_amqp:
121 from carbon import amqp_listener
122 amqp_host = settings.get("AMQP_HOST", "localhost")
123 amqp_port = settings.get("AMQP_PORT", 5672)
124 amqp_user = settings.get("AMQP_USER", "guest")
125 amqp_password = settings.get("AMQP_PASSWORD", "guest")
126 amqp_verbose = settings.get("AMQP_VERBOSE", False)
127 amqp_vhost = settings.get("AMQP_VHOST", "/")
128 amqp_spec = settings.get("AMQP_SPEC", None)
129 amqp_exchange_name = settings.get("AMQP_EXCHANGE", "graphite")
130
131
132# --debug
133if options.debug:
134 logToStdout()
135
136else:
137 if not isdir(options.logdir):
138 os.makedirs(options.logdir)
139
140 if settings.USER:
141 print "Dropping privileges to become the user %s" % settings.USER
142
143 from carbon.util import daemonize, dropprivs
144 daemonize()
145
146 pidfile = open(options.pidfile, 'w')
147 pidfile.write( str(os.getpid()) )
148 pidfile.close()
149
150 def shutdown():
151 if os.path.exists(options.pidfile):
152 os.unlink(options.pidfile)
153
154 atexit.register(shutdown)
155
156 if settings.USER:
157 pwent = pwd.getpwnam(settings.USER)
158 os.chown(options.pidfile, pwent.pw_uid, pwent.pw_gid)
159 dropprivs(settings.USER)
160
161 logToDir(options.logdir)
162
163# Configure application components
164metricReceived.installHandler(MetricCache.store)
165startListener(settings.LINE_RECEIVER_INTERFACE, settings.LINE_RECEIVER_PORT, MetricLineReceiver)
166startListener(settings.PICKLE_RECEIVER_INTERFACE, settings.PICKLE_RECEIVER_PORT, MetricPickleReceiver)
167startListener(settings.CACHE_QUERY_INTERFACE, settings.CACHE_QUERY_PORT, CacheQueryHandler)
168
169if use_amqp:
170 amqp_listener.startReceiver(amqp_host, amqp_port, amqp_user, amqp_password,
171 vhost=amqp_vhost, spec=amqp_spec,
172 exchange_name=amqp_exchange_name,
173 verbose=amqp_verbose)
174
175startWriter()
176startRecordingCacheMetrics()
177
178
179# Run the twisted reactor
180print "%s running" % program
181
182if options.profile:
183 import cProfile
184
185 if exists(options.profile):
186 os.unlink(options.profile)
187
188 cProfile.run('reactor.run()', options.profile)
189
190else:
191 reactor.run()
192>>>>>>> MERGE-SOURCE
31193
=== added file 'carbon/conf/blacklist'
=== modified file 'carbon/conf/carbon.conf.example'
--- carbon/conf/carbon.conf.example 2011-08-15 07:09:56 +0000
+++ carbon/conf/carbon.conf.example 2011-08-26 21:17:16 +0000
@@ -82,10 +82,22 @@
82# degrade performance if logging on the same volume as the whisper data is stored.82# degrade performance if logging on the same volume as the whisper data is stored.
83LOG_UPDATES = False83LOG_UPDATES = False
8484
85<<<<<<< TREE
85# On some systems it is desirable for whisper to write synchronously.86# On some systems it is desirable for whisper to write synchronously.
86# Set this option to True if you'd like to try this. Basically it will87# Set this option to True if you'd like to try this. Basically it will
87# shift the onus of buffering writes from the kernel into carbon's cache.88# shift the onus of buffering writes from the kernel into carbon's cache.
88WHISPER_AUTOFLUSH = False89WHISPER_AUTOFLUSH = False
90=======
91# Adding regular expressions to the whitelist makes carbon accept only matching metrics
92#WHITELIST = /opt/graphite/conf/whitelist
93# Logging metrics that don't match the whitelist can be useful but can degrade performance
94#LOG_WHITELIST = False
95
96# Adding regular expressions to the blacklist prevents carbon from accepting matching metrics
97#BLACKLIST = /opt/graphite/conf/blacklist
98# Logging metrics that match the blacklist can be useful but can degrade performance
99#LOG_BLACKLIST = False
100>>>>>>> MERGE-SOURCE
89101
90# Enable AMQP if you want to receve metrics using an amqp broker102# Enable AMQP if you want to receve metrics using an amqp broker
91# ENABLE_AMQP = False103# ENABLE_AMQP = False
92104
=== added file 'carbon/conf/whitelist'
--- carbon/conf/whitelist 1970-01-01 00:00:00 +0000
+++ carbon/conf/whitelist 2011-08-26 21:17:16 +0000
@@ -0,0 +1,1 @@
1.*
02
=== modified file 'carbon/lib/carbon/protocols.py'
--- carbon/lib/carbon/protocols.py 2011-08-18 08:48:19 +0000
+++ carbon/lib/carbon/protocols.py 2011-08-26 21:17:16 +0000
@@ -2,13 +2,28 @@
2from twisted.internet.protocol import DatagramProtocol2from twisted.internet.protocol import DatagramProtocol
3from twisted.internet.error import ConnectionDone3from twisted.internet.error import ConnectionDone
4from twisted.protocols.basic import LineOnlyReceiver, Int32StringReceiver4from twisted.protocols.basic import LineOnlyReceiver, Int32StringReceiver
5<<<<<<< TREE
5from carbon import log, events, state, management6from carbon import log, events, state, management
7=======
8from carbon.cache import MetricCache
9from carbon.relay import relay
10from carbon.instrumentation import increment
11from carbon.events import metricReceived
12from carbon import log
13from carbon.conf import settings
14import regexlist
15>>>>>>> MERGE-SOURCE
616
7try:17try:
8 import cPickle as pickle18 import cPickle as pickle
9except ImportError:19except ImportError:
10 import pickle20 import pickle
1121
22# Load whitelist and blacklist
23if 'WHITELIST' in settings:
24 whitelist = regexlist.RegexList(settings.WHITELIST)
25if 'BLACKLIST' in settings:
26 blacklist = regexlist.RegexList(settings.BLACKLIST)
1227
13class MetricReceiver:28class MetricReceiver:
14 """ Base class for all metric receiving protocols, handles flow29 """ Base class for all metric receiving protocols, handles flow
@@ -60,6 +75,7 @@
60 log.listener('invalid line received from client %s, ignoring' % self.peerName)75 log.listener('invalid line received from client %s, ignoring' % self.peerName)
61 return76 return
6277
78<<<<<<< TREE
63 events.metricReceived(metric, datapoint)79 events.metricReceived(metric, datapoint)
6480
6581
@@ -76,6 +92,21 @@
7692
7793
78class MetricPickleReceiver(MetricReceiver, Int32StringReceiver):94class MetricPickleReceiver(MetricReceiver, Int32StringReceiver):
95=======
96 if 'BLACKLIST' in settings and blacklist.in_list(metric):
97 log.listener('ignoring metric %s which is blacklisted' % metric)
98 return
99
100 if 'WHITELIST' in settings and not whitelist.in_list(metric):
101 log.listener('ignoring metric %s which is not whitelisted' % metric)
102 return
103
104 increment('metricsReceived')
105 metricReceived(metric, datapoint)
106
107
108class MetricPickleReceiver(LoggingMixin, Int32StringReceiver):
109>>>>>>> MERGE-SOURCE
79 MAX_LENGTH = 2 ** 20110 MAX_LENGTH = 2 ** 20
80111
81 def stringReceived(self, data):112 def stringReceived(self, data):
@@ -89,6 +120,7 @@
89 try:120 try:
90 datapoint = ( float(datapoint[0]), float(datapoint[1]) ) #force proper types121 datapoint = ( float(datapoint[0]), float(datapoint[1]) ) #force proper types
91 except:122 except:
123<<<<<<< TREE
92 continue124 continue
93125
94 if datapoint[1] == datapoint[1]: # filter out NaN values126 if datapoint[1] == datapoint[1]: # filter out NaN values
@@ -126,6 +158,33 @@
126 result = dict(error="Invalid request type \"%s\"" % request['type'])158 result = dict(error="Invalid request type \"%s\"" % request['type'])
127159
128 response = pickle.dumps(result, protocol=-1)160 response = pickle.dumps(result, protocol=-1)
161=======
162 log.listener('invalid pickle received from client %s, ignoring' % self.peerAddr)
163 continue
164
165 if 'BLACKLIST' in settings and blacklist.in_list(metric):
166 log.listener('ignoring metric %s which is blacklisted' % metric)
167 continue
168
169 if 'WHITELIST' in settings and not whitelist.in_list(metric):
170 log.listener('ignoring metric %s which is not whitelisted' % metric)
171 continue
172
173 if datapoint[1] != datapoint[1]: # filter out NaN values
174 log.listener('ingoring metric with NaN value, %s' % metric)
175 continue
176
177 metricReceived(metric, datapoint)
178
179 increment('metricsReceived', len(datapoints))
180
181
182class CacheQueryHandler(LoggingMixin, Int32StringReceiver):
183 def stringReceived(self, metric):
184 values = MetricCache.get(metric, [])
185 log.query('cache query for %s returned %d values' % (metric, len(values)))
186 response = pickle.dumps(values, protocol=-1)
187>>>>>>> MERGE-SOURCE
129 self.sendString(response)188 self.sendString(response)
130189
131190
132191
=== modified file 'carbon/lib/carbon/writer.py'
--- carbon/lib/carbon/writer.py 2011-08-15 21:16:40 +0000
+++ carbon/lib/carbon/writer.py 2011-08-26 21:17:16 +0000
@@ -15,6 +15,7 @@
1515
16import os16import os
17import time17import time
18import re
18from os.path import join, exists, dirname, basename19from os.path import join, exists, dirname, basename
19from carbon import state20from carbon import state
2021
@@ -41,6 +42,65 @@
41CACHE_SIZE_LOW_WATERMARK = settings.MAX_CACHE_SIZE * 0.9542CACHE_SIZE_LOW_WATERMARK = settings.MAX_CACHE_SIZE * 0.95
4243
4344
45class RegexList:
46 """ List of regexs """
47
48 def __init__(self, listFile):
49 self.regexs = []
50 self.listFile = listFile
51 self.mtime = None
52 self.lastUpdate = time.time()
53 self.update()
54
55 def update(self):
56 """ If the list file has changed, reload it """
57 self.lastUpdate = time.time()
58 if not os.path.exists(self.listFile):
59 self.regexs = []
60 self.mtime = None
61 return
62 mtime = os.stat(self.listFile).st_mtime
63 if mtime > self.mtime:
64 self.mtime = mtime
65 """ Load the list file """
66 self.regexs = []
67 self.mtime = os.stat(self.listFile).st_mtime
68 file = open(self.listFile)
69 patterns = file.read().split('\n')
70 file.close()
71 for pattern in patterns:
72 p = pattern.strip()
73 if p and p[0] != '#':
74 self.regexs.append(re.compile(p))
75
76 def in_list(self, value):
77 """ Test if regexs is listed """
78 if time.time() - self.lastUpdate > 5:
79 self.update()
80 for regex in self.regexs:
81 if regex.search(value):
82 return True
83 return False
84
85 def matching(self, value):
86 """ Returns the regular expressions that matches a value """
87 matches = []
88 for regex in self.regexs:
89 if regex.search(value):
90 matches.append(regex.pattern)
91 return matches
92
93# Load whitelist and blacklist
94if settings.WHITELIST:
95 whitelist = RegexList(settings.WHITELIST)
96else:
97 whitelist = None
98if settings.BLACKLIST:
99 blacklist = RegexList(settings.BLACKLIST)
100else:
101 blacklist = None
102
103
44def optimalWriteOrder():104def optimalWriteOrder():
45 "Generates metrics with the most cached values first and applies a soft rate limit on new metrics"105 "Generates metrics with the most cached values first and applies a soft rate limit on new metrics"
46 global lastCreateInterval106 global lastCreateInterval
@@ -96,6 +156,16 @@
96 for (metric, datapoints, dbFilePath, dbFileExists) in optimalWriteOrder():156 for (metric, datapoints, dbFilePath, dbFileExists) in optimalWriteOrder():
97 dataWritten = True157 dataWritten = True
98158
159 # Skip metrics that are blacklisted or not in the whitelist
160 if blacklist and blacklist.in_list(metric):
161 if settings.LOG_BLACKLIST:
162 log.msg('metric %s is blacklisted' % metric)
163 continue
164 if whitelist and not whitelist.in_list(metric):
165 if settings.LOG_WHITELIST:
166 log.msg('metric %s is not whitelisted' % metric)
167 continue
168
99 if not dbFileExists:169 if not dbFileExists:
100 archiveConfig = None170 archiveConfig = None
101171