Merge lp:~notmyname/swift/stats_system into lp:~hudson-openstack/swift/trunk
Status: Superseded
Proposed branch: lp:~notmyname/swift/stats_system
Merge into: lp:~hudson-openstack/swift/trunk
Diff against target: 677 lines (+641/-0), 7 files modified
  bin/swift-account-stats-logger.py (+81/-0)
  bin/swift-log-uploader (+83/-0)
  etc/log-processing.conf-sample (+27/-0)
  swift/common/compressed_file_reader.py (+72/-0)
  swift/common/internal_proxy.py (+174/-0)
  swift/stats/account_stats.py (+69/-0)
  swift/stats/log_uploader.py (+135/-0)
To merge this branch: bzr merge lp:~notmyname/swift/stats_system
Related blueprints: Stats system (High)
Reviewer: Swift Core security contacts, status: Pending
Review via email: mp+31875@code.launchpad.net
This proposal has been superseded by a proposal from 2010-08-06.
Commit message
Description of the change
log_uploader and a few supporting libraries as the first part of the stats system
- 50. By John Dickinson: added account stats logger to stats system
- 51. By John Dickinson: added log_processor and a stats plugin
- 52. By John Dickinson: merged with trunk
- 53. By John Dickinson: added access log processing plugin
- 54. By John Dickinson: merged with trunk
- 55. By John Dickinson: merged with trunk
- 56. By John Dickinson: merged with trunk
- 57. By John Dickinson: merged with trunk
- 58. By John Dickinson: merged with trunk
- 59. By John Dickinson: merged with trunk
- 60. By John Dickinson: initial tests for the stats system
- 61. By John Dickinson: first test working
- 62. By John Dickinson: merged with trunk
- 63. By John Dickinson: access log parsing tests pass
- 64. By John Dickinson: added (working) stats tests
- 65. By John Dickinson: merged with trunk (utils fix)
- 66. By John Dickinson: updated stats binaries to be DRY compliant
- 67. By John Dickinson: merged with trunk
- 68. By John Dickinson: set up log-stats-collector as a daemon process to create csv files
- 69. By John Dickinson: merged with trunk
- 70. By John Dickinson: added execute perms to stats processor binaries
- 71. By John Dickinson: updated config file loading to work with paste.deploy configs
- 72. By John Dickinson: fixed typos
- 73. By John Dickinson: fixed internal proxy loading
- 74. By John Dickinson: made a memcache stub for the internal proxy server
- 75. By John Dickinson: fixed internal proxy put_container reference
- 76. By John Dickinson: fixed some log uploading glob patterns
- 77. By John Dickinson: fixed bug in calculating offsets for filename patterns
- 78. By John Dickinson: fixed typos in log processor
- 79. By John Dickinson: fixed get_data_list in log_processor
- 80. By John Dickinson: added some debug output
- 81. By John Dickinson: fixed logging and log uploading
- 82. By John Dickinson: fixed copy/paste errors and missing imports
- 83. By John Dickinson: added error handling and missing return statement
- 84. By John Dickinson: handled some typos and better handling of missing data in internal proxy
- 85. By John Dickinson: fixed tests, typos, and added error handling
- 86. By John Dickinson: fixed bug in account stats log processing
- 87. By John Dickinson: fixed listing filter in log processing
- 88. By John Dickinson: fixed stdout capturing for generating csv files
- 89. By John Dickinson: fixed replica count reporting error
- 90. By John Dickinson: fixed lookback in log processor
- 91. By John Dickinson: merged with trunk
- 92. By John Dickinson: merged with trunk
- 93. By John Dickinson: fixed tests to account for changed key name
- 94. By John Dickinson: fixed test bug
- 95. By John Dickinson: updated with changes and suggestions from code review
- 96. By John Dickinson: merged with changes from trunk
- 97. By John Dickinson: added stats overview
- 98. By John Dickinson: updated with changes from trunk
- 99. By John Dickinson: added additional docs
- 100. By John Dickinson: documentation clarification and pep8 fixes
- 101. By John Dickinson: added overview stats to the doc index
- 102. By John Dickinson: made long lines wrap (grr pep8)
- 103. By John Dickinson: merged with trunk
- 104. By John Dickinson: merged with trunk
- 105. By John Dickinson: merged with trunk
- 106. By John Dickinson: fixed stats docs
- 107. By John Dickinson: added a bad lines check to the access log parser
- 108. By John Dickinson: added tests for compressing file reader
- 109. By John Dickinson: fixed compressing file reader test
- 110. By John Dickinson: fixed compressing file reader test
- 111. By John Dickinson: improved compressing file reader test
- 112. By John Dickinson: fixed compressing file reader test
- 113. By John Dickinson: made try/except much less inclusive in access log processor
- 114. By John Dickinson: added keylist mapping tests and fixed other tests
- 115. By John Dickinson: updated stats system tests
- 116. By John Dickinson: merged with gholt's test stubs
- 117. By John Dickinson: updated SAIO instructions for the stats system
- 118. By John Dickinson: fixed stats system saio docs
- 119. By John Dickinson: fixed stats system saio docs
- 120. By John Dickinson: added openstack copyright/license to test_log_processor.py
- 121. By John Dickinson: improved logging in log processors
- 122. By John Dickinson: merged with trunk
- 123. By John Dickinson: fixed missing working directory bug in account stats
- 124. By John Dickinson: fixed readconf parameter that was broken with a previous merge
- 125. By John Dickinson: updated setup.py and saio docs for stats system
- 126. By John Dickinson: added readconf unit test
- 127. By John Dickinson: updated stats saio docs to create logs with the appropriate permissions
- 128. By John Dickinson: fixed bug in log processor internal proxy lazy load code
- 129. By John Dickinson: updated readconf test
- 130. By John Dickinson: updated readconf test
- 131. By John Dickinson: updated readconf test
- 132. By John Dickinson: fixed internal proxy references in log processor
- 133. By John Dickinson: fixed account stats filename creation
- 134. By John Dickinson: pep8 tomfoolery
- 135. By John Dickinson: moved paren
- 136. By John Dickinson: added lazy load of internal proxy to log processor (you were right clay)
Unmerged revisions
Preview Diff
=== added file 'bin/swift-account-stats-logger.py'
--- bin/swift-account-stats-logger.py	1970-01-01 00:00:00 +0000
+++ bin/swift-account-stats-logger.py	2010-08-06 04:11:39 +0000
@@ -0,0 +1,81 @@
#!/usr/bin/python
# Copyright (c) 2010 OpenStack, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import signal
import sys
import time
from ConfigParser import ConfigParser

from swift.stats.account_stats import AccountStat
from swift.common import utils

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print "Usage: swift-account-stats-logger CONFIG_FILE"
        sys.exit()

    c = ConfigParser()
    if not c.read(sys.argv[1]):
        print "Unable to read config file."
        sys.exit(1)

    if c.has_section('log-processor-stats'):
        stats_conf = dict(c.items('log-processor-stats'))
    else:
        print "Unable to find log-processor-stats config section in %s." % \
            sys.argv[1]
        sys.exit(1)

    # reference this from the account stats conf

    target_dir = stats_conf.get('log_dir', '/var/log/swift')
    account_server_conf_loc = stats_conf.get('account_server_conf',
                                             '/etc/swift/account-server.conf')
    filename_format = stats_conf['source_filename_format']
    try:
        c = ConfigParser()
        c.read(account_server_conf_loc)
        account_server_conf = dict(c.items('account-server'))
    except:
        print "Unable to load account server conf from %s" % \
            account_server_conf_loc
        sys.exit(1)

    utils.drop_privileges(account_server_conf.get('user', 'swift'))

    try:
        os.setsid()
    except OSError:
        pass

    logger = utils.get_logger(stats_conf, 'swift-account-stats-logger')

    def kill_children(*args):
        signal.signal(signal.SIGTERM, signal.SIG_IGN)
        os.killpg(0, signal.SIGTERM)
        sys.exit()

    signal.signal(signal.SIGTERM, kill_children)

    stats = AccountStat(filename_format,
                        target_dir,
                        account_server_conf,
                        logger)
    logger.info("Gathering account stats")
    start = time.time()
    stats.find_and_process()
    logger.info("Gathering account stats complete (%0.2f minutes)" %
                ((time.time() - start) / 60))
=== added file 'bin/swift-log-uploader'
--- bin/swift-log-uploader	1970-01-01 00:00:00 +0000
+++ bin/swift-log-uploader	2010-08-06 04:11:39 +0000
@@ -0,0 +1,83 @@
#!/usr/bin/python
# Copyright (c) 2010 OpenStack, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import signal
import sys
import time
from ConfigParser import ConfigParser

from swift.stats.log_uploader import LogUploader
from swift.common.utils import get_logger

if __name__ == '__main__':
    if len(sys.argv) < 3:
        print "Usage: swift-log-uploader CONFIG_FILE plugin"
        sys.exit()

    c = ConfigParser()
    if not c.read(sys.argv[1]):
        print "Unable to read config file."
        sys.exit(1)

    if c.has_section('log-processor'):
        # generic defaults; the plugin section below overrides these
        uploader_conf = dict(c.items('log-processor'))
    else:
        print "Unable to find log-processor config section in %s." % \
            sys.argv[1]
        sys.exit(1)

    plugin = sys.argv[2]
    section_name = 'log-processor-%s' % plugin
    if c.has_section(section_name):
        uploader_conf.update(dict(c.items(section_name)))
    else:
        print "Unable to find %s config section in %s." % (section_name,
                                                           sys.argv[1])
        sys.exit(1)

    try:
        os.setsid()
    except OSError:
        pass

    logger = get_logger(uploader_conf, 'swift-log-uploader')

    def kill_children(*args):
        signal.signal(signal.SIGTERM, signal.SIG_IGN)
        os.killpg(0, signal.SIGTERM)
        sys.exit()

    signal.signal(signal.SIGTERM, kill_children)

    log_dir = uploader_conf.get('log_dir', '/var/log/swift/')
    swift_account = uploader_conf['swift_account']
    container_name = uploader_conf['container_name']
    source_filename_format = uploader_conf['source_filename_format']
    proxy_server_conf_loc = uploader_conf.get('proxy_server_conf',
                                              '/etc/swift/proxy-server.conf')
    try:
        c = ConfigParser()
        c.read(proxy_server_conf_loc)
        proxy_server_conf = dict(c.items('proxy-server'))
    except:
        proxy_server_conf = None
    uploader = LogUploader(log_dir, swift_account, container_name,
                           source_filename_format, proxy_server_conf, logger)
    logger.info("Uploading logs")
    start = time.time()
    uploader.upload_all_logs()
    logger.info("Uploading logs complete (%0.2f minutes)" %
                ((time.time() - start) / 60))
=== added file 'etc/log-processing.conf-sample'
--- etc/log-processing.conf-sample	1970-01-01 00:00:00 +0000
+++ etc/log-processing.conf-sample	2010-08-06 04:11:39 +0000
@@ -0,0 +1,27 @@
# plugin section format is named "log-processor-<plugin>"
# section "log-processor" is the generic defaults (overridden by plugins)

[log-processor]
# working_dir = /tmp/swift/
# proxy_server_conf = /etc/swift/proxy-server.conf
# log_facility = LOG_LOCAL0
# log_level = INFO
# lookback_hours = 120
# lookback_window = 120

[log-processor-access]
# log_dir = /var/log/swift/
swift_account = AUTH_7abbc116-8a07-4b63-819d-02715d3e0f31
container_name = log_data
source_filename_format = %Y%m%d%H*
class_path = swift.stats.access_processor
# service ips is for client ip addresses that should be counted as servicenet
# service_ips =

[log-processor-stats]
# log_dir = /var/log/swift/
swift_account = AUTH_7abbc116-8a07-4b63-819d-02715d3e0f31
container_name = account_stats
source_filename_format = %Y%m%d%H*
class_path = swift.stats.stats_processor
# account_server_conf = /etc/swift/account-server.conf
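The layering the sample describes, a generic [log-processor] section whose values are overridden per plugin by a [log-processor-<plugin>] section, is the same merge the uploader script performs with two dict() calls. A minimal sketch of that lookup, written with Python 3's configparser (the scripts in this branch use the Python 2 ConfigParser, but the merge logic is identical); the option values below are illustrative, not from a real deployment:

```python
from configparser import ConfigParser

SAMPLE = """
[log-processor]
lookback_hours = 120
log_level = INFO

[log-processor-access]
container_name = log_data
log_level = DEBUG
"""

def plugin_conf(parser, plugin):
    # Start from the generic defaults, then let the plugin section override.
    conf = dict(parser.items('log-processor'))
    conf.update(dict(parser.items('log-processor-%s' % plugin)))
    return conf

parser = ConfigParser()
parser.read_string(SAMPLE)
conf = plugin_conf(parser, 'access')
# lookback_hours survives from the defaults; log_level is overridden.
```

Options that appear only in the generic section fall through unchanged, which is why the sample can comment out log_dir in both plugin sections and still pick up a default.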
=== added file 'swift/common/compressed_file_reader.py'
--- swift/common/compressed_file_reader.py	1970-01-01 00:00:00 +0000
+++ swift/common/compressed_file_reader.py	2010-08-06 04:11:39 +0000
@@ -0,0 +1,72 @@
# Copyright (c) 2010 OpenStack, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import zlib
import struct


class CompressedFileReader(object):
    '''
    Wraps a file object and provides a read method that returns gzip'd data.

    One warning: if read is called with a small value, the data returned may
    be bigger than the value. In this case, the "compressed" data will be
    bigger than the original data. To solve this, use a bigger read buffer.

    An example use case:
    Given an uncompressed file on disk, provide a way to read compressed data
    without buffering the entire file data in memory. Using this class, an
    uncompressed log file could be uploaded as compressed data with chunked
    transfer encoding.

    gzip header and footer code taken from the python stdlib gzip module

    :param file_obj: File object to read from
    :param compresslevel: compression level
    '''
    def __init__(self, file_obj, compresslevel=9):
        self._f = file_obj
        self._compressor = zlib.compressobj(compresslevel,
                                            zlib.DEFLATED,
                                            -zlib.MAX_WBITS,
                                            zlib.DEF_MEM_LEVEL,
                                            0)
        self.done = False
        self.first = True
        self.crc32 = 0
        self.total_size = 0

    def read(self, *a, **kw):
        if self.done:
            return ''
        x = self._f.read(*a, **kw)
        if x:
            self.crc32 = zlib.crc32(x, self.crc32) & 0xffffffffL
            self.total_size += len(x)
            compressed = self._compressor.compress(x)
            if not compressed:
                compressed = self._compressor.flush(zlib.Z_SYNC_FLUSH)
        else:
            compressed = self._compressor.flush(zlib.Z_FINISH)
            crc32 = struct.pack("<L", self.crc32 & 0xffffffffL)
            size = struct.pack("<L", self.total_size & 0xffffffffL)
            footer = crc32 + size
            compressed += footer
            self.done = True
        if self.first:
            self.first = False
            header = '\037\213\010\000\000\000\000\000\002\377'
            compressed = header + compressed
        return compressed
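The framing above can be checked end to end: raw-deflate the data with zlib (negative wbits suppresses the zlib wrapper), prepend the 10-byte gzip header, and append the little-endian CRC32 and total size. The result is a valid gzip member that a standard decoder recovers intact. A one-shot sketch of the same framing in Python 3 (the class itself streams the body chunk by chunk; this collapses it to a single call for clarity):

```python
import struct
import zlib

def gzip_frame(data, compresslevel=9):
    # Same layout as CompressedFileReader: 10-byte gzip header,
    # raw deflate body, then CRC32 and total size, little-endian.
    compressor = zlib.compressobj(compresslevel, zlib.DEFLATED,
                                  -zlib.MAX_WBITS, zlib.DEF_MEM_LEVEL, 0)
    header = b'\037\213\010\000\000\000\000\000\002\377'
    body = compressor.compress(data) + compressor.flush(zlib.Z_FINISH)
    footer = struct.pack('<L', zlib.crc32(data) & 0xffffffff)
    footer += struct.pack('<L', len(data) & 0xffffffff)
    return header + body + footer

payload = b'some log line\n' * 1000
framed = gzip_frame(payload)
# wbits=16+MAX_WBITS tells zlib to expect (and verify) the gzip wrapper.
restored = zlib.decompress(framed, 16 + zlib.MAX_WBITS)
assert restored == payload
```

Because the decoder verifies the trailing CRC32 and size, a corrupted upload fails loudly at decompression time rather than silently yielding bad log data.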
=== added file 'swift/common/internal_proxy.py'
--- swift/common/internal_proxy.py	1970-01-01 00:00:00 +0000
+++ swift/common/internal_proxy.py	2010-08-06 04:11:39 +0000
@@ -0,0 +1,174 @@
# Copyright (c) 2010 OpenStack, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import webob
from urllib import quote, unquote
from json import loads as json_loads

from swift.common.compressed_file_reader import CompressedFileReader
from swift.proxy.server import BaseApplication


class InternalProxy(object):
    """
    Set up a private instance of a proxy server that allows normal requests
    to be made without having to actually send the request to the proxy.
    This also doesn't log the requests to the normal proxy logs.

    :param proxy_server_conf: proxy server configuration dictionary
    :param logger: logger to log requests to
    :param retries: number of times to retry each request
    """
    def __init__(self, proxy_server_conf=None, logger=None, retries=0):
        self.upload_app = BaseApplication(proxy_server_conf, logger)
        self.retries = retries

    def upload_file(self, source_file, account, container, object_name,
                    compress=True, content_type='application/x-gzip'):
        """
        Upload a file to cloud files.

        :param source_file: path to or file like object to upload
        :param account: account to upload to
        :param container: container to upload to
        :param object_name: name of object being uploaded
        :param compress: if True, compresses object as it is uploaded
        :param content_type: content-type of object
        :returns: True if successful, False otherwise
        """
        log_create_pattern = '/v1/%s/%s/%s' % (account, container, object_name)

        # create the container
        if not self.create_container(account, container):
            return False

        # upload the file to the account
        req = webob.Request.blank(log_create_pattern,
                                  environ={'REQUEST_METHOD': 'PUT'},
                                  headers={'Transfer-Encoding': 'chunked'})
        if compress:
            if hasattr(source_file, 'read'):
                compressed_file = CompressedFileReader(source_file)
            else:
                compressed_file = CompressedFileReader(open(source_file, 'rb'))
            req.body_file = compressed_file
        else:
            if not hasattr(source_file, 'read'):
                source_file = open(source_file, 'rb')
            req.body_file = source_file
        req.account = account
        req.content_type = content_type
        req.content_length = None   # to make sure we send chunked data
        resp = self.upload_app.handle_request(
            self.upload_app.update_request(req))
        tries = 1
        while (resp.status_int < 200 or resp.status_int > 299) \
                and tries <= self.retries:
            resp = self.upload_app.handle_request(
                self.upload_app.update_request(req))
            tries += 1
        if not (200 <= resp.status_int < 300):
            return False
        return True

    def get_object(self, account, container, object_name):
        """
        Get object.

        :param account: account name object is in
        :param container: container name object is in
        :param object_name: name of object to get
        :returns: iterator for object data
        """
        req = webob.Request.blank('/v1/%s/%s/%s' %
                                  (account, container, object_name),
                                  environ={'REQUEST_METHOD': 'GET'})
        req.account = account
        resp = self.upload_app.handle_request(
            self.upload_app.update_request(req))
        tries = 1
        while (resp.status_int < 200 or resp.status_int > 299) \
                and tries <= self.retries:
            resp = self.upload_app.handle_request(
                self.upload_app.update_request(req))
            tries += 1
        for x in resp.app_iter:
            yield x

    def create_container(self, account, container):
        """
        Create container.

        :param account: account name to put the container in
        :param container: container name to create
        :returns: True if successful, otherwise False
        """
        req = webob.Request.blank('/v1/%s/%s' % (account, container),
                                  environ={'REQUEST_METHOD': 'PUT'})
        req.account = account
        resp = self.upload_app.handle_request(
            self.upload_app.update_request(req))
        tries = 1
        while (resp.status_int < 200 or resp.status_int > 299) \
                and tries <= self.retries:
            resp = self.upload_app.handle_request(
                self.upload_app.update_request(req))
            tries += 1
        return 200 <= resp.status_int < 300

    def get_container_list(self, account, container, marker=None, limit=None,
                           prefix=None, delimiter=None, full_listing=True):
        """
        Get container listing.

        :param account: account name for the container
        :param container: container name to get the listing of
        :param marker: marker query
        :param limit: limit to query
        :param prefix: prefix query
        :param delimiter: delimiter for query
        :param full_listing: if True, make enough requests to get all listings
        :returns: list of objects
        """
        if full_listing:
            rv = []
            listing = self.get_container_list(account, container, marker,
                                              limit, prefix, delimiter,
                                              full_listing=False)
            while listing:
                rv.extend(listing)
                if not delimiter:
                    marker = listing[-1]['name']
                else:
                    marker = listing[-1].get('name', listing[-1].get('subdir'))
                listing = self.get_container_list(account, container, marker,
                                                  limit, prefix, delimiter,
                                                  full_listing=False)
            return rv
        path = '/v1/%s/%s' % (account, container)
        qs = 'format=json'
        if marker:
            qs += '&marker=%s' % quote(marker)
        if limit:
            qs += '&limit=%d' % limit
        if prefix:
            qs += '&prefix=%s' % quote(prefix)
        if delimiter:
            qs += '&delimiter=%s' % quote(delimiter)
        path += '?%s' % qs
        req = webob.Request.blank(path, environ={'REQUEST_METHOD': 'GET'})
        req.account = account
        resp = self.upload_app.handle_request(
            self.upload_app.update_request(req))
        tries = 1
        while (resp.status_int < 200 or resp.status_int > 299) \
                and tries <= self.retries:
            resp = self.upload_app.handle_request(
                self.upload_app.update_request(req))
            tries += 1
        if resp.status_int == 204:
            return []
        if 200 <= resp.status_int < 300:
            return json_loads(resp.body)
464 | === added file 'swift/stats/__init__.py' | |||
465 | === added file 'swift/stats/account_stats.py' | |||
466 | --- swift/stats/account_stats.py 1970-01-01 00:00:00 +0000 | |||
467 | +++ swift/stats/account_stats.py 2010-08-06 04:11:39 +0000 | |||
468 | @@ -0,0 +1,69 @@ | |||
469 | 1 | # Copyright (c) 2010 OpenStack, LLC. | ||
470 | 2 | # | ||
471 | 3 | # Licensed under the Apache License, Version 2.0 (the "License"); | ||
472 | 4 | # you may not use this file except in compliance with the License. | ||
473 | 5 | # You may obtain a copy of the License at | ||
474 | 6 | # | ||
475 | 7 | # http://www.apache.org/licenses/LICENSE-2.0 | ||
476 | 8 | # | ||
477 | 9 | # Unless required by applicable law or agreed to in writing, software | ||
478 | 10 | # distributed under the License is distributed on an "AS IS" BASIS, | ||
479 | 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or | ||
480 | 12 | # implied. | ||
481 | 13 | # See the License for the specific language governing permissions and | ||
482 | 14 | # limitations under the License. | ||
483 | 15 | |||
484 | 16 | import os | ||
485 | 17 | import time | ||
486 | 18 | |||
487 | 19 | from swift.account.server import DATADIR as account_server_data_dir | ||
488 | 20 | from swift.common.db import AccountBroker | ||
489 | 21 | from swift.common.internal_proxy import InternalProxy | ||
490 | 22 | from swift.common.utils import renamer | ||
491 | 23 | |||
492 | 24 | class AccountStat(object): | ||
493 | 25 | def __init__(self, filename_format, target_dir, server_conf, logger): | ||
494 | 26 | self.filename_format = filename_format | ||
495 | 27 | self.target_dir = target_dir | ||
496 | 28 | self.devices = server_conf.get('devices', '/srv/node') | ||
497 | 29 | self.mount_check = server_conf.get('mount_check', 'true').lower() in \ | ||
498 | 30 | ('true', 't', '1', 'on', 'yes', 'y') | ||
499 | 31 | self.logger = logger | ||
500 | 32 | |||
501 | 33 | def find_and_process(self): | ||
502 | 34 | src_filename = time.strftime(self.filename_format) | ||
503 | 35 | tmp_filename = os.path.join('/tmp', src_filename) | ||
504 | 36 | with open(tmp_filename, 'wb') as statfile: | ||
505 | 37 | #statfile.write('Account Name, Container Count, Object Count, Bytes Used, Created At\n') | ||
506 | 38 | for device in os.listdir(self.devices): | ||
507 | 39 | if self.mount_check and \ | ||
508 | 40 | not os.path.ismount(os.path.join(self.devices, device)): | ||
509 | 41 | self.logger.error("Device %s is not mounted, skipping." % | ||
510 | 42 | device) | ||
511 | 43 | continue | ||
512 | 44 | accounts = os.path.join(self.devices, | ||
513 | 45 | device, | ||
514 | 46 | account_server_data_dir) | ||
515 | 47 | if not os.path.exists(accounts): | ||
516 | 48 | self.logger.debug("Path %s does not exist, skipping." % | ||
517 | 49 | accounts) | ||
518 | 50 | continue | ||
519 | 51 | for root, dirs, files in os.walk(accounts, topdown=False): | ||
520 | 52 | for filename in files: | ||
521 | 53 | if filename.endswith('.db'): | ||
522 | 54 | broker = AccountBroker(os.path.join(root, filename)) | ||
523 | 55 | if not broker.is_deleted(): | ||
524 | 56 | account_name, | ||
525 | 57 | created_at, | ||
526 | 58 | _, _, | ||
527 | 59 | container_count, | ||
528 | 60 | object_count, | ||
529 | 61 | bytes_used, | ||
530 | 62 | _, _ = broker.get_info() | ||
531 | 63 | line_data = '"%s",%d,%d,%d,%s\n' % (account_name, | ||
532 | 64 | container_count, | ||
533 | 65 | object_count, | ||
534 | 66 | bytes_used, | ||
535 | 67 | created_at) | ||
536 | 68 | statfile.write(line_data) | ||
537 | 69 | renamer(tmp_filename, os.path.join(self.target_dir, src_filename)) | ||
538 | 0 | 70 | ||
539 | === added file 'swift/stats/log_uploader.py' | |||
540 | --- swift/stats/log_uploader.py 1970-01-01 00:00:00 +0000 | |||
541 | +++ swift/stats/log_uploader.py 2010-08-06 04:11:39 +0000 | |||
542 | @@ -0,0 +1,135 @@ | |||
# Copyright (c) 2010 OpenStack, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from __future__ import with_statement
import os
import hashlib
import time
import gzip
import glob

from swift.common.internal_proxy import InternalProxy

class LogUploader(object):
    '''
    Given a local directory, a swift account, and a container name,
    LogUploader will upload all but the newest files in the local directory
    to the given account/container. Each file's md5 sum is computed and used
    to prevent the same data from being uploaded more than once under
    different filenames (ex: log lines). Since the hash is computed anyway,
    it is also used as the uploaded object's etag to ensure data integrity.

    Note that after a file is successfully uploaded, it is unlinked.

    The given proxy server config is used to instantiate a proxy server for
    the object uploads.
    '''

    def __init__(self, log_dir, swift_account, container_name, filename_format,
                 proxy_server_conf, logger):
        if not log_dir.endswith('/'):
            log_dir = log_dir + '/'
        self.log_dir = log_dir
        self.swift_account = swift_account
        self.container_name = container_name
        self.filename_format = filename_format
        self.internal_proxy = InternalProxy(proxy_server_conf, logger)
        self.logger = logger

    def upload_all_logs(self):
        i = [(self.filename_format.find(c), c) for c in '%Y %m %d %H'.split()]
        i.sort()
        year_offset = month_offset = day_offset = hour_offset = None
        # The positions come from the format string, but glob returns full
        # paths (shift by len(log_dir)) and %Y renders as four characters
        # while occupying two in the format (shift later codes by two).
        base_offset = len(self.log_dir)
        for start, c in i:
            if start < 0:
                # code not present in the format; leave its offset as None
                continue
            offset = base_offset + start
            if c == '%Y':
                year_offset = offset, offset + 4
                base_offset += 2
            elif c == '%m':
                month_offset = offset, offset + 2
            elif c == '%d':
                day_offset = offset, offset + 2
            elif c == '%H':
                hour_offset = offset, offset + 2
        if not (year_offset and month_offset and day_offset and hour_offset):
            # don't have all the parts, can't upload anything
            return
        glob_pattern = self.filename_format
        glob_pattern = glob_pattern.replace('%Y', '????')
        glob_pattern = glob_pattern.replace('%m', '??')
        glob_pattern = glob_pattern.replace('%d', '??')
        glob_pattern = glob_pattern.replace('%H', '??')
        filelist = glob.iglob(os.path.join(self.log_dir, glob_pattern))
        self.internal_proxy.create_container(self.swift_account,
                                             self.container_name)
        for filename in filelist:
            # From the filename, we need to derive the year, month, day,
            # and hour for the file. These values are used in the uploaded
            # object's name, so they should be a reasonably accurate
            # representation of the time for which the data in the file was
            # collected. The file's last modified time is not a reliable
            # representation of the data in the file. For example, an old
            # log file (from hour A) may be uploaded or moved into the
            # log_dir in hour Z. The file's modified time will be for hour
            # Z, and therefore the object's name in the system would not
            # represent the data in it.
            # If the filename doesn't match the format, it shouldn't be
            # uploaded.
            year = filename[slice(*year_offset)]
            month = filename[slice(*month_offset)]
            day = filename[slice(*day_offset)]
            hour = filename[slice(*hour_offset)]
            if not (year.isdigit() and month.isdigit() and
                    day.isdigit() and hour.isdigit()):
                # unexpected filename format, move on
                self.logger.error("Unexpected log: %s" % filename)
                continue
            if (time.time() - os.stat(filename).st_mtime) < 7200:
                # don't process very new logs
                self.logger.debug(
                    "Skipping log: %s (< 2 hours old)" % filename)
                continue
            self.upload_one_log(filename, year, month, day, hour)
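One subtlety in deriving date parts from filenames: `%Y` occupies two characters in the format string but four in the rendered name, so any code positioned after it needs a two-character shift, and globbed paths carry the `log_dir` prefix. A width-aware standalone sketch (`datetime_offsets` is a hypothetical helper, not part of swift):

```python
import time

def datetime_offsets(log_dir, filename_format):
    # Map each strftime code to the (start, end) slice it occupies in a
    # full path rendered as log_dir + strftime(filename_format).
    widths = {'%Y': 4, '%m': 2, '%d': 2, '%H': 2}
    found = sorted((filename_format.find(c), c) for c in widths)
    offsets = {}
    shift = len(log_dir)  # rendered paths start with the directory prefix
    for start, code in found:
        if start < 0:
            continue  # code absent from this format
        offsets[code] = (start + shift, start + shift + widths[code])
        # %Y is 2 chars in the format but 4 when rendered, so every code
        # after it moves right by the difference (2); other codes keep
        # their literal width and contribute no shift.
        shift += widths[code] - len(code)
    return offsets

log_dir = '/var/log/swift/'
fmt = 'access-%Y%m%d%H'
offsets = datetime_offsets(log_dir, fmt)
path = log_dir + time.strftime(fmt, time.gmtime(0))  # 1970-01-01 00:00 UTC
```

Extracting with these slices recovers `'1970'`, `'01'`, `'01'`, `'00'` from the rendered path, which a naive position-in-format-string calculation would not.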

    def upload_one_log(self, filename, year, month, day, hour):
        if os.path.getsize(filename) == 0:
            self.logger.debug("Log %s is 0 length, skipping" % filename)
            return
        self.logger.debug("Processing log: %s" % filename)
        filehash = hashlib.md5()
        already_compressed = filename.endswith('.gz')
        opener = gzip.open if already_compressed else open
        f = opener(filename, 'rb')
        try:
            for line in f:
                # filter out bad lines here?
                filehash.update(line)
        finally:
            f.close()
        filehash = filehash.hexdigest()
        # By adding a hash to the filename, we ensure that uploaded files
        # have unique filenames and protect against uploading one file
        # more than one time. By using md5, we get an etag for free.
        target_filename = '/'.join([year, month, day, hour, filehash + '.gz'])
        if self.internal_proxy.upload_file(filename,
                                           self.swift_account,
                                           self.container_name,
                                           target_filename,
                                           compress=(not already_compressed)):
            self.logger.debug("Uploaded log %s to %s" %
                              (filename, target_filename))
            os.unlink(filename)
        else:
            self.logger.error("ERROR: Upload of log %s failed!" % filename)
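`upload_one_log` streams each file through md5 once and reuses the digest as both the object-name suffix and, in effect, the upload's etag. Because it hashes the *uncompressed* bytes, the same log data maps to the same object name whether or not the file arrived gzipped. A standalone sketch (`hash_log` is hypothetical; it reads fixed-size chunks rather than lines, which hashes the same bytes in constant memory even when a log has no newlines):

```python
import gzip
import hashlib
import os
import tempfile

def hash_log(path, chunk_size=64 * 1024):
    # Stream the (possibly gzipped) log through md5 so arbitrarily large
    # files hash without loading them fully into memory.
    opener = gzip.open if path.endswith('.gz') else open
    md5 = hashlib.md5()
    with opener(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()

# identical content, one plain and one gzipped, hashes to the same digest
tmp = tempfile.mkdtemp()
plain = os.path.join(tmp, 'access.log')
packed = os.path.join(tmp, 'access.log.gz')
data = b'GET /v1/AUTH_test/c/o 200\n' * 100
with open(plain, 'wb') as f:
    f.write(data)
with gzip.open(packed, 'wb') as f:
    f.write(data)
target = '/'.join(['1970', '01', '01', '00', hash_log(plain) + '.gz'])
```

Embedding the digest in the object name is what makes re-uploads of the same data idempotent: a second upload of identical content lands on the same `year/month/day/hour/<md5>.gz` name instead of creating a duplicate object.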