Merge lp:~mleinartas/graphite/whitelist into lp:~graphite-dev/graphite/main

Proposed by Michael Leinartas
Status: Merged
Merged at revision: 675
Proposed branch: lp:~mleinartas/graphite/whitelist
Merge into: lp:~graphite-dev/graphite/main
Diff against target: 232 lines (+127/-0)
8 files modified
carbon/conf/blacklist.conf.example (+4/-0)
carbon/conf/carbon.conf.example (+15/-0)
carbon/conf/whitelist.conf.example (+5/-0)
carbon/lib/carbon/conf.py (+19/-0)
carbon/lib/carbon/protocols.py (+7/-0)
carbon/lib/carbon/regexlist.py (+62/-0)
carbon/lib/carbon/service.py (+5/-0)
docs/config-carbon.rst (+10/-0)
To merge this branch: bzr merge lp:~mleinartas/graphite/whitelist
Reviewer Review Type Date Requested Status
chrismd Pending
Review via email: mp+85703@code.launchpad.net

This proposal supersedes a proposal from 2011-01-10.

Description of the change

We would be completely overwhelmed by the volume of metrics flowing into Graphite if we accepted everything that came our way. I can't control what is sent to Graphite, but I can control what is accepted into Graphite. To implement that, we've created whitelist and blacklist functionality that respectively allows or denies metrics entry into Graphite. This code is based on code that has been running in production for months with several thousand metrics per minute after filtering.

Each list is a file with one regular expression per line. If a metric matches a regex in the blacklist it is dropped. Only if a metric matches a regex in the whitelist will it be accepted by carbon.

For example, a whitelist consisting of the line:
^server\.load\.
will allow any metric that starts with "server.load." into carbon, unless it also matches a regex in the blacklist.

To test, uncomment the WHITELIST and/or BLACKLIST lines in carbon.conf and add one or more regular expressions to graphite/conf/whitelist and/or graphite/conf/blacklist. Restart carbon for your changes to take effect. Metrics which are rejected are logged in listener.log.

**UPDATE**
I've submitted a branch with trunk merged back in and with some changes from the original patch. This should get another review before merging. This will still be compatible with the patch our friends at Orbitz are running, but will require them to add and enable the global USE_WHITELIST flag.

Some things to validate:
 * whitelist and blacklist configs are by default in CONF_DIR/whitelist and CONF_DIR/blacklist. This seems like a good place, but would it make more sense to use a .conf file for consistency with the rest of the files in that directory?
* I moved the RegexList class to it's own file - regexlist.py
* Functionality can be globally enabled/disabled with the USE_WHITELIST flag in carbon.conf. Removing the conf files also effectively disables it, but an additional global flag feels appropriate

Also note that this makes some changes to config - I can merge this in with the megacarbon branch separately after it's merged to trunk, or I could branch it off of megacarbon instead if that's better.

To post a comment you must log in.
Revision history for this message
Jon Stevens (latchkey-gmail) wrote : Posted in a previous version of this proposal

I personally think that this should be done through a separate proxy sitting in front of carbon-cache.py. It seems like by the time carbon accepts the request and parses it, you might as well just record it.

Revision history for this message
Eric Ziegenhorn (ziggy) wrote : Posted in a previous version of this proposal

But that takes storage space, the time to create it, etc. I agree that
would be a good architecture though if a single proxy could handle the flow
in real time, and I'd guess it could in most cases. Note that filtering
happens right when the request hits carbon though, not after much
processing. It never gets to the metric queue if it isn't whitelisted.

On Mon, Jan 10, 2011 at 12:13 PM, Jon Stevens <email address hidden> wrote:

> I personally think that this should be done through a separate proxy
> sitting in front of carbon-cache.py. It seems like by the time carbon
> accepts the request and parses it, you might as well just record it.
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote : Posted in a previous version of this proposal

Sorry for the delayed response. Actually carbon-cache already has a whitelist feature, it broke a while back because a default value for its config option went away somehow but I just fixed that in trunk. Anyways, it uses an exact list of metric names for whitelisting and doesn't provide blacklisting so your patch definitely adds value.

However you use a module I've never heard of, regexlist. I would prefer to not add another thirdparty dependency if possible and in this case I think it would be pretty simple to write a quick function to read the regexes from a file into a list. Also the diffs against carbon-cache.py no longer apply since Sidnei's refactoring to use twistd will be merged soon.

Thanks for submitting this, I'm sure many will find it useful once we get it merged in

review: Needs Fixing
Revision history for this message
Eric Ziegenhorn (ziggy) wrote : Posted in a previous version of this proposal

I wrote regexlist module specifically for this feature. I could in-line it
somewhere if that is preferrable. I can update it to apply to trunk in the
not-too-distant future if there is nothing else to re-work first, just let
me know. Thanks!

On Sun, Jul 10, 2011 at 5:22 PM, chrismd <email address hidden> wrote:

> Review: Needs Fixing
> Sorry for the delayed response. Actually carbon-cache already has a
> whitelist feature, it broke a while back because a default value for its
> config option went away somehow but I just fixed that in trunk. Anyways, it
> uses an exact list of metric names for whitelisting and doesn't provide
> blacklisting so your patch definitely adds value.
>
> However you use a module I've never heard of, regexlist. I would prefer to
> not add another thirdparty dependency if possible and in this case I think
> it would be pretty simple to write a quick function to read the regexes from
> a file into a list. Also the diffs against carbon-cache.py no longer apply
> since Sidnei's refactoring to use twistd will be merged soon.
>
> Thanks for submitting this, I'm sure many will find it useful once we get
> it merged in
> --
> https://code.launchpad.net/~ziggy/graphite/whitelist/+merge/45710<https://code.launchpad.net/%7Eziggy/graphite/whitelist/+merge/45710>
> You are the owner of lp:~ziggy/graphite/whitelist.
>

Revision history for this message
chrismd (chrismd) wrote : Posted in a previous version of this proposal

Sure just inline the function. The only thing I'd change is the if 'BLACKLIST' in settings check being done for each line received, ideally that lookup only need to be done once outside the lineReceived function.

> I wrote regexlist module specifically for this feature. I could in-line it
> somewhere if that is preferrable. I can update it to apply to trunk in the
> not-too-distant future if there is nothing else to re-work first, just let
> me know. Thanks!
>

Revision history for this message
Eric Ziegenhorn (ziggy) wrote : Posted in a previous version of this proposal

I in-lined the RegexList class in writer.py as per your request. I changed the checks in writeCachedDataPoints() to try and make them faster especially in the case you're not using a whitelist or blacklist, but I don't see a way to get rid of them entirely. Feel free to enlighten me. :) I added more config settings for logging related to the whitelist or blacklist because it can be very useful to know where your metrics are going but sometimes it is too slow. I can tweak/get rid of that as you'd prefer. I can write proper documentation for this for the wiki if and when you want. I have this refactoring running in dev, but will push out to staging and production for further testing if I'm heading in the right direction for getting it upstream.

Additionally, I made a small standalone web interface for checking a metric to see if it would be whitelisted or blacklisted and if so, which regexes it matches. This has been helpful because when you get lots of whitelist or blacklist entries it can be hard to tell if a given metric matches or not. It's ~130 lines of python (+~2700 lines for the bottle.py micro-framework). Let me know if that'd be useful to include or reference with Graphite.

Revision history for this message
chrismd (chrismd) wrote : Posted in a previous version of this proposal

Hi Ziggy, thanks for the updates. Unfortunately this patch doesn't apply cleanly to trunk because of all the recent carbon refactoring (ie. carbon-cache.py on trunk is 100% different than the source this branch derives from). It probably won't take much to sync it with current trunk but that should definitely happen as a trunk merge into this branch rather than the other way around, at least at first. Can you 'bzr merge lp:graphite' into your branch, resolve the conflicts and verify everything still works? If we can get this branch into a state where it cleanly merges into trunk then we'll be good to go. Thanks.

lp:~mleinartas/graphite/whitelist updated
623. By Michael Leinartas

add whitelist documentation

624. By Michael Leinartas

add instrumentation for whitelist/blacklist matches

Revision history for this message
Eric Ziegenhorn (ziggy) wrote :

I've been trying to get around to this but haven't found time yet...thanks for getting it cleaned up and ready for trunk Michael!

lp:~mleinartas/graphite/whitelist updated
625. By Michael Leinartas

use .conf extention for whitelist and blacklist for consistency

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added file 'carbon/conf/blacklist.conf.example'
2--- carbon/conf/blacklist.conf.example 1970-01-01 00:00:00 +0000
3+++ carbon/conf/blacklist.conf.example 2012-01-04 18:43:36 +0000
4@@ -0,0 +1,4 @@
5+# This file takes a single regular expression per line
6+# If USE_WHITELIST is set to True in carbon.conf, any metrics received which
7+# match one of these expressions will be dropped
8+^some\.noisy\.metric\.prefix\..*
9
10=== modified file 'carbon/conf/carbon.conf.example'
11--- carbon/conf/carbon.conf.example 2011-12-03 08:12:58 +0000
12+++ carbon/conf/carbon.conf.example 2012-01-04 18:43:36 +0000
13@@ -92,6 +92,11 @@
14 # shift the onus of buffering writes from the kernel into carbon's cache.
15 WHISPER_AUTOFLUSH = False
16
17+# Set this to True to enable whitelisting and blacklisting of metrics in
18+# CONF_DIR/whitelist and CONF_DIR/blacklist. If the whitelist is missing or
19+# empty, all metrics will pass through
20+# USE_WHITELIST = False
21+
22 # Enable AMQP if you want to receve metrics using an amqp broker
23 # ENABLE_AMQP = False
24
25@@ -180,6 +185,11 @@
26 # data until the send queues fall below 80% MAX_QUEUE_SIZE.
27 USE_FLOW_CONTROL = True
28
29+# Set this to True to enable whitelisting and blacklisting of metrics in
30+# CONF_DIR/whitelist and CONF_DIR/blacklist. If the whitelist is missing or
31+# empty, all metrics will pass through
32+# USE_WHITELIST = False
33+
34
35 [aggregator]
36 LINE_RECEIVER_INTERFACE = 0.0.0.0
37@@ -227,3 +237,8 @@
38 # each metric. Aggregation only happens for datapoints that fall in
39 # the past MAX_AGGREGATION_INTERVALS * intervalSize seconds.
40 MAX_AGGREGATION_INTERVALS = 5
41+
42+# Set this to True to enable whitelisting and blacklisting of metrics in
43+# CONF_DIR/whitelist and CONF_DIR/blacklist. If the whitelist is missing or
44+# empty, all metrics will pass through
45+# USE_WHITELIST = False
46
47=== added file 'carbon/conf/whitelist.conf.example'
48--- carbon/conf/whitelist.conf.example 1970-01-01 00:00:00 +0000
49+++ carbon/conf/whitelist.conf.example 2012-01-04 18:43:36 +0000
50@@ -0,0 +1,5 @@
51+# This file takes a single regular expression per line
52+# If USE_WHITELIST is set to True in carbon.conf, only metrics received which
53+# match one of these expressions will be persisted. If this file is empty or
54+# missing, all metrics will pass through
55+.*
56
57=== modified file 'carbon/lib/carbon/conf.py'
58--- carbon/lib/carbon/conf.py 2011-12-19 18:02:23 +0000
59+++ carbon/lib/carbon/conf.py 2012-01-04 18:43:36 +0000
60@@ -59,6 +59,7 @@
61 DESTINATIONS=[],
62 USE_FLOW_CONTROL=True,
63 USE_INSECURE_UNPICKLER=False,
64+ USE_WHITELIST=False,
65 )
66
67
68@@ -158,6 +159,8 @@
69 ["config", "c", None, "Use the given config file."],
70 ["instance", "", "a", "Manage a specific carbon instance."],
71 ["logdir", "", None, "Write logs to the given directory."],
72+ ["whitelist", "", None, "List of metric patterns to allow."],
73+ ["blacklist", "", None, "List of metric patterns to disallow."],
74 ]
75
76 def postOptions(self):
77@@ -217,6 +220,14 @@
78 os.makedirs(logdir)
79 log.logToDir(logdir)
80
81+ if self["whitelist"] is None:
82+ self["whitelist"] = join(settings["CONF_DIR"], "whitelist.conf")
83+ settings["whitelist"] = self["whitelist"]
84+
85+ if self["blacklist"] is None:
86+ self["blacklist"] = join(settings["CONF_DIR"], "blacklist.conf")
87+ settings["blacklist"] = self["blacklist"]
88+
89 def parseArgs(self, *action):
90 """If an action was provided, store it for further processing."""
91 if len(action) == 1:
92@@ -346,6 +357,14 @@
93 default=None,
94 help="Use the given config file")
95 parser.add_option(
96+ "--whitelist",
97+ default=None,
98+ help="Use the given whitelist file")
99+ parser.add_option(
100+ "--blacklist",
101+ default=None
102+ help="Use the given blacklist file")
103+ parser.add_option(
104 "--logdir",
105 default=None,
106 help="Write logs in the given directory")
107
108=== modified file 'carbon/lib/carbon/protocols.py'
109--- carbon/lib/carbon/protocols.py 2011-10-06 09:22:43 +0000
110+++ carbon/lib/carbon/protocols.py 2012-01-04 18:43:36 +0000
111@@ -4,6 +4,7 @@
112 from twisted.protocols.basic import LineOnlyReceiver, Int32StringReceiver
113 from carbon import log, events, state, management
114 from carbon.conf import settings
115+from carbon.regexlist import WhiteList, BlackList
116 from carbon.util import pickle, get_unpickler
117
118
119@@ -46,6 +47,12 @@
120 events.resumeReceivingMetrics.removeHandler(self.resumeReceiving)
121
122 def metricReceived(self, metric, datapoint):
123+ if BlackList and metric in BlackList:
124+ instrumentation.increment('blackListMatches')
125+ return
126+ if WhiteList and metric not in WhiteList:
127+ instrumentation.increment('whiteListMatches')
128+ return
129 if datapoint[1] == datapoint[1]: # filter out NaN values
130 events.metricReceived(metric, datapoint)
131
132
133=== added file 'carbon/lib/carbon/regexlist.py'
134--- carbon/lib/carbon/regexlist.py 1970-01-01 00:00:00 +0000
135+++ carbon/lib/carbon/regexlist.py 2012-01-04 18:43:36 +0000
136@@ -0,0 +1,62 @@
137+import time
138+import re
139+import os.path
140+from carbon import log
141+from twisted.internet.task import LoopingCall
142+
143+
144+class RegexList:
145+ """ Maintain a list of regex for matching whitelist and blacklist """
146+
147+ def __init__(self):
148+ self.regex_list = []
149+ self.list_file = None
150+ self.read_task = LoopingCall(self.read_list)
151+ self.rules_last_read = 0.0
152+
153+ def read_from(self, list_file):
154+ self.list_file = list_file
155+ self.read_list()
156+ self.read_task.start(10, now=False)
157+
158+ def read_list(self):
159+ # Clear rules and move on if file isn't there
160+ if not os.path.exists(self.list_file):
161+ self.regex_list = []
162+ return
163+
164+ try:
165+ mtime = os.path.getmtime(self.list_file)
166+ except:
167+ log.err("Failed to get mtime of %s" % self.list_file)
168+ return
169+
170+ if mtime <= self.rules_last_read:
171+ return
172+
173+ # Begin read
174+ new_regex_list = []
175+ for line in open(self.list_file):
176+ pattern = line.strip()
177+ if line.startswith('#') or not line:
178+ continue
179+ try:
180+ new_regex_list.append(re.compile(pattern))
181+ except:
182+ log.err("Failed to parse '%s' in '%s'. Ignoring line" % (pattern, self.list_file))
183+
184+ self.regex_list = new_regex_list
185+ self.rules_last_read = mtime
186+
187+ def __contains__(self, value):
188+ for regex in self.regex_list:
189+ if regex.search(value):
190+ return True
191+ return False
192+
193+ def __nonzero__(self):
194+ return bool(self.regex_list)
195+
196+
197+WhiteList = RegexList()
198+BlackList = RegexList()
199
200=== modified file 'carbon/lib/carbon/service.py'
201--- carbon/lib/carbon/service.py 2011-09-18 09:55:35 +0000
202+++ carbon/lib/carbon/service.py 2012-01-04 18:43:36 +0000
203@@ -94,6 +94,11 @@
204 interface=settings.MANHOLE_INTERFACE)
205 service.setServiceParent(root_service)
206
207+ if settings.USE_WHITELIST:
208+ from carbon.regexlist import WhiteList,BlackList
209+ WhiteList.read_from(settings["whitelist"])
210+ BlackList.read_from(settings["blacklist"])
211+
212 # Instantiate an instrumentation service that will record metrics about
213 # this service.
214 from carbon.instrumentation import InstrumentationService
215
216=== modified file 'docs/config-carbon.rst'
217--- docs/config-carbon.rst 2011-09-20 17:12:37 +0000
218+++ docs/config-carbon.rst 2012-01-04 18:43:36 +0000
219@@ -152,3 +152,13 @@
220 aggregate metric 'prod.applications.apache.all.requests' would be calculated
221 by summing their values.
222
223+whitelist and blacklist
224+-----------------------
225+The whitelist functionality allows any of the carbon daemons to only accept metrics that are explicitly
226+whitelisted and/or to reject blacklisted metrics. The functionality can be enabled in carbon.conf with
227+the ``USE_WHITELIST`` flag. This can be useful when too many metrics are being sent to a Graphite
228+instance or when there are metric senders sending useless or invalid metrics.
229+
230+The whitelist and blacklist files are located in ``GRAPHITE_CONF_DIR``. Each file contains one regular
231+expressions per line to match against metric values. If the whitelist configuration is missing or empty,
232+all metrics will be passed through.

Subscribers

People subscribed via source and target branches