Merge lp:~mkanat/loggerhead/raw-prefix into lp:loggerhead

Proposed by Max Kanat-Alexander
Status: Merged
Approved by: Max Kanat-Alexander
Approved revision: 442
Merged at revision: 440
Proposed branch: lp:~mkanat/loggerhead/raw-prefix
Merge into: lp:loggerhead
Prerequisite: lp:~mkanat/loggerhead/raw-controller
Diff against target: 152 lines (+70/-2)
3 files modified
docs/serve-branches.rst (+18/-1)
loggerhead/apps/transport.py (+46/-1)
loggerhead/config.py (+6/-0)
To merge this branch: bzr merge lp:~mkanat/loggerhead/raw-prefix
Reviewer Review Type Date Requested Status
Martin Pool Approve
Review via email: mp+44682@code.launchpad.net

Commit message

Add a --raw-prefix option for XSS protection on raw content.

Description of the change

This adds XSS protection to the raw-controller branch, by adding an option to serve raw content from a separate branch. I wanted this to be a separate landing so that it could get its own specific review. The docs I added for --raw-prefix should explain everything (and if they don't, then the docs should be updated).

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote :

That looks good, and it's a nice clean fix.

iirc, there are now 3 options

0- don't serve arbitrary content except as attachments
1- serve it within the standard domain (trust it not to be malicious)
2- serve it from a different domain (with --raw-prefix)

With this and it's dependency merged, what's the default, 0 or 1? I hope it's 0 and you need to opt in to 1.

review: Approve
Revision history for this message
Max Kanat-Alexander (mkanat) wrote :

The options are only 1 and 2, actually. The default is 1. I opted for convenience and discoverability over security in this case, because nearly every loggerhead installation doesn't have any need for security. That is, there's nothing dangerous that an attacker could do with an XSS on a loggerhead installation in most places, because loggerhead is not a container for any sensitive information. It doesn't even have cookies or a login system.

Revision history for this message
Martin Pool (mbp) wrote :

On 5 January 2011 12:39, Max Kanat-Alexander <email address hidden> wrote:
> The options are only 1 and 2, actually. The default is 1. I opted for convenience and discoverability over security in this case, because nearly every loggerhead installation doesn't have any need for security. That is, there's nothing dangerous that an attacker could do with an XSS on a loggerhead installation in most places, because loggerhead is not a container for any sensitive information. It doesn't even have cookies or a login system.

Fair enough.

--
Martin

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'docs/serve-branches.rst'
2--- docs/serve-branches.rst 2010-03-25 10:32:47 +0000
3+++ docs/serve-branches.rst 2010-12-24 22:52:57 +0000
4@@ -72,7 +72,24 @@
5
6 .. cmdoption:: --use-cdn
7
8- Serve YUI javascript libraries from Yahoo!'s CDN.
9+ Serve YUI javascript libraries from Yahoo!'s CDN.\
10+
11+.. cmdoption:: --raw-prefix=RAW_PREFIX
12+
13+ Serve raw files from a separate URL than the normal URL
14+ that loggerhead runs on. This URL should have a different domain name
15+ than your loggerhead installation normally has, but should resolve
16+ to this same loggerhead installation. This is an important security
17+ protection that you should use if you can, as it will protect you against
18+ Cross-Site Scripting attacks from malicious content in repositories.
19+
20+ If you put %branch% into the URL, it will be replaced with the full
21+ path to the current branch, replacing every / with _. This allows you
22+ to have separate domain names for each branch, enhancing the protection
23+ provided by this option.
24+
25+ Unike ``--prefix``, this is a full URL. If you also use ``--prefix``, that
26+ value should not be included at the end of this URL.
27
28 .. cmdoption:: --allow-writes
29
30
31=== modified file 'loggerhead/apps/transport.py'
32--- loggerhead/apps/transport.py 2010-05-14 07:53:31 +0000
33+++ loggerhead/apps/transport.py 2010-12-24 22:52:57 +0000
34@@ -16,6 +16,7 @@
35 #
36 """Serve branches at urls that mimic a transport's file system layout."""
37
38+import string
39 import threading
40
41 from bzrlib import branch, errors, urlutils
42@@ -23,7 +24,7 @@
43 from bzrlib.transport import get_transport
44 from bzrlib.transport.http import wsgi
45
46-from paste.request import path_info_pop
47+from paste.request import path_info_pop, construct_url
48 from paste import httpexceptions
49 from paste import urlparser
50
51@@ -49,6 +50,7 @@
52 self._config = root._config
53
54 def app_for_branch(self, branch):
55+ self.check_raw_prefix(True)
56 if not self.name:
57 name = branch._get_nick(local=True)
58 is_root = True
59@@ -74,6 +76,7 @@
60 name = self.name
61 else:
62 name = '/'
63+ self.check_raw_prefix(False)
64 return DirectoryUI(
65 environ['loggerhead.static.url'], self.transport, name)
66 else:
67@@ -85,6 +88,7 @@
68 return BranchesFromTransportServer(new_transport, self.root, new_name)
69
70 def app_for_bazaar_data(self, relpath):
71+ self.check_raw_prefix(False)
72 if relpath == '/.bzr/smart':
73 root_transport = get_transport_for_thread(self.root.base)
74 wsgi_app = wsgi.SmartWSGIApp(root_transport)
75@@ -100,6 +104,43 @@
76 else:
77 return urlparser.make_static(None, path)
78
79+ def check_raw_prefix(self, is_branch):
80+ raw_prefix = self._config.get_option('raw_prefix');
81+ if raw_prefix is None:
82+ return
83+ environ = self._environ
84+
85+ # Replace %branch% with an escaped version of the branch name.
86+ if '%branch%' in raw_prefix:
87+ branch = environ['SCRIPT_NAME']
88+ branch = branch.strip('/')
89+ # Only allow numbers and letters in domain names. Convert
90+ # everything else to an underscore.
91+ domain_chars = string.ascii_letters + string.digits
92+ branch = ['_' if i not in domain_chars else i for i in branch]
93+ branch = ''.join(branch)
94+ raw_prefix = raw_prefix.replace('%branch%', branch)
95+
96+ base_url = construct_url(environ, with_path_info=False,
97+ with_query_string=False)
98+ path = environ['PATH_INFO']
99+
100+ # The /raw controller may only be served over raw_prefix.
101+ if is_branch and path.startswith('/raw'):
102+ if not base_url.startswith(raw_prefix):
103+ url_end = environ.get('SCRIPT_NAME','') \
104+ + environ.get('PATH_INFO', '')
105+ if environ.get('QUERY_STRING'):
106+ url_end += '?' + environ['QUERY_STRING']
107+ url_to = raw_prefix + url_end
108+ raise httpexceptions.HTTPFound(url_to)
109+ # No other URLs may be served over raw_prefix (besides
110+ # static URLs) because they could associate session cookies
111+ # or important authentication info with raw domain, and
112+ # we don't want to allow that.
113+ elif base_url.startswith(raw_prefix):
114+ raise httpexceptions.HTTPNotFound()
115+
116 def check_serveable(self, config):
117 value = config.get_user_option('http_serve')
118 if value is None:
119@@ -108,6 +149,10 @@
120 raise httpexceptions.HTTPNotFound()
121
122 def __call__(self, environ, start_response):
123+ # check_raw_prefix needs this, and it's simpler to set it here
124+ # then pass it around everywhere it's needed. check_raw_prefix
125+ # is only called after this point.
126+ self._environ = environ
127 path = environ['PATH_INFO']
128 try:
129 b = branch.Branch.open_from_transport(self.transport)
130
131=== modified file 'loggerhead/config.py'
132--- loggerhead/config.py 2010-11-29 10:07:55 +0000
133+++ loggerhead/config.py 2010-12-24 22:52:57 +0000
134@@ -71,6 +71,9 @@
135 help="Print the software version and exit")
136 parser.add_option("--use-cdn", action="store_true", dest="use_cdn",
137 help="Serve YUI from Yahoo!'s CDN")
138+ parser.add_option("--raw-prefix", action='callback',
139+ callback=_fix_raw_prefix, type="string",
140+ help="Serve raw files from this different base URL.")
141 parser.add_option("--cache-dir", dest="sql_dir",
142 help="The directory to place the SQL cache in")
143 parser.add_option("--allow-writes", action="store_true",
144@@ -86,6 +89,9 @@
145 'critical': 50,
146 }
147
148+def _fix_raw_prefix(option, opt_str, value, parser):
149+ parser.values.raw_prefix = value.rstrip('/')
150+
151 def _optparse_level_to_int_level(option, opt_str, value, parser):
152 parser.values.log_level = _level_to_int_level(value)
153

Subscribers

People subscribed via source and target branches