Merge lp:~mkanat/loggerhead/raw-prefix into lp:loggerhead

Proposed by Max Kanat-Alexander
Status: Merged
Approved by: Max Kanat-Alexander
Approved revision: 442
Merged at revision: 440
Proposed branch: lp:~mkanat/loggerhead/raw-prefix
Merge into: lp:loggerhead
Prerequisite: lp:~mkanat/loggerhead/raw-controller
Diff against target: 152 lines (+70/-2)
3 files modified
docs/serve-branches.rst (+18/-1)
loggerhead/apps/transport.py (+46/-1)
loggerhead/config.py (+6/-0)
To merge this branch: bzr merge lp:~mkanat/loggerhead/raw-prefix
Reviewer Review Type Date Requested Status
Martin Pool Approve
Review via email: mp+44682@code.launchpad.net

Commit message

Add a --raw-prefix option for XSS protection on raw content.

Description of the change

This adds XSS protection to the raw-controller branch, by adding an option to serve raw content from a separate branch. I wanted this to be a separate landing so that it could get its own specific review. The docs I added for --raw-prefix should explain everything (and if they don't, then the docs should be updated).

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote :

That looks good, and it's a nice clean fix.

iirc, there are now 3 options

0- don't serve arbitrary content except as attachments
1- serve it within the standard domain (trust it not to be malicious)
2- serve it from a different domain (with --raw-prefix)

With this and it's dependency merged, what's the default, 0 or 1? I hope it's 0 and you need to opt in to 1.

review: Approve
Revision history for this message
Max Kanat-Alexander (mkanat) wrote :

The options are only 1 and 2, actually. The default is 1. I opted for convenience and discoverability over security in this case, because nearly every loggerhead installation doesn't have any need for security. That is, there's nothing dangerous that an attacker could do with an XSS on a loggerhead installation in most places, because loggerhead is not a container for any sensitive information. It doesn't even have cookies or a login system.

Revision history for this message
Martin Pool (mbp) wrote :

On 5 January 2011 12:39, Max Kanat-Alexander <email address hidden> wrote:
> The options are only 1 and 2, actually. The default is 1. I opted for convenience and discoverability over security in this case, because nearly every loggerhead installation doesn't have any need for security. That is, there's nothing dangerous that an attacker could do with an XSS on a loggerhead installation in most places, because loggerhead is not a container for any sensitive information. It doesn't even have cookies or a login system.

Fair enough.

--
Martin

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'docs/serve-branches.rst'
--- docs/serve-branches.rst 2010-03-25 10:32:47 +0000
+++ docs/serve-branches.rst 2010-12-24 22:52:57 +0000
@@ -72,7 +72,24 @@
7272
73.. cmdoption:: --use-cdn73.. cmdoption:: --use-cdn
74 74
75 Serve YUI javascript libraries from Yahoo!'s CDN.75 Serve YUI javascript libraries from Yahoo!'s CDN.\
76
77.. cmdoption:: --raw-prefix=RAW_PREFIX
78
79 Serve raw files from a separate URL than the normal URL
80 that loggerhead runs on. This URL should have a different domain name
81 than your loggerhead installation normally has, but should resolve
82 to this same loggerhead installation. This is an important security
83 protection that you should use if you can, as it will protect you against
84 Cross-Site Scripting attacks from malicious content in repositories.
85
86 If you put %branch% into the URL, it will be replaced with the full
87 path to the current branch, replacing every / with _. This allows you
88 to have separate domain names for each branch, enhancing the protection
89 provided by this option.
90
91 Unike ``--prefix``, this is a full URL. If you also use ``--prefix``, that
92 value should not be included at the end of this URL.
7693
77.. cmdoption:: --allow-writes94.. cmdoption:: --allow-writes
78 95
7996
=== modified file 'loggerhead/apps/transport.py'
--- loggerhead/apps/transport.py 2010-05-14 07:53:31 +0000
+++ loggerhead/apps/transport.py 2010-12-24 22:52:57 +0000
@@ -16,6 +16,7 @@
16#16#
17"""Serve branches at urls that mimic a transport's file system layout."""17"""Serve branches at urls that mimic a transport's file system layout."""
1818
19import string
19import threading20import threading
2021
21from bzrlib import branch, errors, urlutils22from bzrlib import branch, errors, urlutils
@@ -23,7 +24,7 @@
23from bzrlib.transport import get_transport24from bzrlib.transport import get_transport
24from bzrlib.transport.http import wsgi25from bzrlib.transport.http import wsgi
2526
26from paste.request import path_info_pop27from paste.request import path_info_pop, construct_url
27from paste import httpexceptions28from paste import httpexceptions
28from paste import urlparser29from paste import urlparser
2930
@@ -49,6 +50,7 @@
49 self._config = root._config50 self._config = root._config
5051
51 def app_for_branch(self, branch):52 def app_for_branch(self, branch):
53 self.check_raw_prefix(True)
52 if not self.name:54 if not self.name:
53 name = branch._get_nick(local=True)55 name = branch._get_nick(local=True)
54 is_root = True56 is_root = True
@@ -74,6 +76,7 @@
74 name = self.name76 name = self.name
75 else:77 else:
76 name = '/'78 name = '/'
79 self.check_raw_prefix(False)
77 return DirectoryUI(80 return DirectoryUI(
78 environ['loggerhead.static.url'], self.transport, name)81 environ['loggerhead.static.url'], self.transport, name)
79 else:82 else:
@@ -85,6 +88,7 @@
85 return BranchesFromTransportServer(new_transport, self.root, new_name)88 return BranchesFromTransportServer(new_transport, self.root, new_name)
8689
87 def app_for_bazaar_data(self, relpath):90 def app_for_bazaar_data(self, relpath):
91 self.check_raw_prefix(False)
88 if relpath == '/.bzr/smart':92 if relpath == '/.bzr/smart':
89 root_transport = get_transport_for_thread(self.root.base)93 root_transport = get_transport_for_thread(self.root.base)
90 wsgi_app = wsgi.SmartWSGIApp(root_transport)94 wsgi_app = wsgi.SmartWSGIApp(root_transport)
@@ -100,6 +104,43 @@
100 else:104 else:
101 return urlparser.make_static(None, path)105 return urlparser.make_static(None, path)
102106
107 def check_raw_prefix(self, is_branch):
108 raw_prefix = self._config.get_option('raw_prefix');
109 if raw_prefix is None:
110 return
111 environ = self._environ
112
113 # Replace %branch% with an escaped version of the branch name.
114 if '%branch%' in raw_prefix:
115 branch = environ['SCRIPT_NAME']
116 branch = branch.strip('/')
117 # Only allow numbers and letters in domain names. Convert
118 # everything else to an underscore.
119 domain_chars = string.ascii_letters + string.digits
120 branch = ['_' if i not in domain_chars else i for i in branch]
121 branch = ''.join(branch)
122 raw_prefix = raw_prefix.replace('%branch%', branch)
123
124 base_url = construct_url(environ, with_path_info=False,
125 with_query_string=False)
126 path = environ['PATH_INFO']
127
128 # The /raw controller may only be served over raw_prefix.
129 if is_branch and path.startswith('/raw'):
130 if not base_url.startswith(raw_prefix):
131 url_end = environ.get('SCRIPT_NAME','') \
132 + environ.get('PATH_INFO', '')
133 if environ.get('QUERY_STRING'):
134 url_end += '?' + environ['QUERY_STRING']
135 url_to = raw_prefix + url_end
136 raise httpexceptions.HTTPFound(url_to)
137 # No other URLs may be served over raw_prefix (besides
138 # static URLs) because they could associate session cookies
139 # or important authentication info with raw domain, and
140 # we don't want to allow that.
141 elif base_url.startswith(raw_prefix):
142 raise httpexceptions.HTTPNotFound()
143
103 def check_serveable(self, config):144 def check_serveable(self, config):
104 value = config.get_user_option('http_serve')145 value = config.get_user_option('http_serve')
105 if value is None:146 if value is None:
@@ -108,6 +149,10 @@
108 raise httpexceptions.HTTPNotFound()149 raise httpexceptions.HTTPNotFound()
109150
110 def __call__(self, environ, start_response):151 def __call__(self, environ, start_response):
152 # check_raw_prefix needs this, and it's simpler to set it here
153 # then pass it around everywhere it's needed. check_raw_prefix
154 # is only called after this point.
155 self._environ = environ
111 path = environ['PATH_INFO']156 path = environ['PATH_INFO']
112 try:157 try:
113 b = branch.Branch.open_from_transport(self.transport)158 b = branch.Branch.open_from_transport(self.transport)
114159
=== modified file 'loggerhead/config.py'
--- loggerhead/config.py 2010-11-29 10:07:55 +0000
+++ loggerhead/config.py 2010-12-24 22:52:57 +0000
@@ -71,6 +71,9 @@
71 help="Print the software version and exit")71 help="Print the software version and exit")
72 parser.add_option("--use-cdn", action="store_true", dest="use_cdn",72 parser.add_option("--use-cdn", action="store_true", dest="use_cdn",
73 help="Serve YUI from Yahoo!'s CDN")73 help="Serve YUI from Yahoo!'s CDN")
74 parser.add_option("--raw-prefix", action='callback',
75 callback=_fix_raw_prefix, type="string",
76 help="Serve raw files from this different base URL.")
74 parser.add_option("--cache-dir", dest="sql_dir",77 parser.add_option("--cache-dir", dest="sql_dir",
75 help="The directory to place the SQL cache in")78 help="The directory to place the SQL cache in")
76 parser.add_option("--allow-writes", action="store_true",79 parser.add_option("--allow-writes", action="store_true",
@@ -86,6 +89,9 @@
86 'critical': 50,89 'critical': 50,
87}90}
8891
92def _fix_raw_prefix(option, opt_str, value, parser):
93 parser.values.raw_prefix = value.rstrip('/')
94
89def _optparse_level_to_int_level(option, opt_str, value, parser):95def _optparse_level_to_int_level(option, opt_str, value, parser):
90 parser.values.log_level = _level_to_int_level(value)96 parser.values.log_level = _level_to_int_level(value)
9197

Subscribers

People subscribed via source and target branches