Merge lp:~verterok/ubuntuone-client/fix-987376 into lp:ubuntuone-client
Status: Work in progress
Proposed branch: lp:~verterok/ubuntuone-client/fix-987376
Merge into: lp:ubuntuone-client
Diff against target: 250 lines (+76/-39), 2 files modified
  tests/syncdaemon/test_tritcask.py (+40/-13)
  ubuntuone/syncdaemon/tritcask.py (+36/-26)
To merge this branch: bzr merge lp:~verterok/ubuntuone-client/fix-987376
Related bugs: (none)
Reviewers:
  Facundo Batista (community): Approve
  Manuel de la Peña (community): Approve
Review via email: mp+103273@code.launchpad.net
Commit message
Only use mmap for files of size < 2**31 bytes, to avoid hitting address-space limits, and fall back to standard IO otherwise.
Description of the change
Only use mmap for files of size < 2**31 bytes, to avoid hitting address-space limits, and fall back to standard IO otherwise.
I'll be working on limiting the file growth in a follow-up branch, to fix Bug #987382.
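The gist of the change can be sketched as follows. This is a minimal illustration, not the branch's actual code: the real patch adds a `max_mmap_size` class attribute to `DataFile`/`ImmutableDataFile`, while the helper name and structure below are made up for the example.

```python
import mmap
import os

# Threshold mirroring the branch: 2**31 overflows a signed long on 32-bit
# builds, so only files strictly smaller than this get mmap'ed.
MAX_MMAP_SIZE = 2 ** 31 - 1


def open_for_read(path):
    """Open path for reading, preferring mmap when the file is small enough.

    Returns an mmap object for non-empty files under MAX_MMAP_SIZE (mmap
    cannot map an empty file), otherwise the plain file object, i.e. the
    standard-IO fallback. Both support read()/seek(), so callers can treat
    the result uniformly.
    """
    fd = open(path, 'rb')
    if 0 < os.path.getsize(path) < MAX_MMAP_SIZE:
        return mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ)
    return fd
```

Since both return types expose the same read interface, the iteration code only has to branch once, at open time, which is exactly the shape of the patched `iter_entries`.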
Manuel de la Peña (mandel) wrote:
The following happens on Windows:
[ERROR]
Traceback (most recent call last):
File "C:\Users\
cdaemon\
for i, entry in enumerate(
File "C:\Users\
\syncdaemon\
for entry in self._iter_
File "C:\Users\
\syncdaemon\
entry, new_pos = self.read(fd)
File "C:\Users\
\syncdaemon\
data = fd.read(
exceptions.
tests.syncdaemo
=======
[ERROR]
Traceback (most recent call last):
File "C:\Users\
cdaemon\
for i, entry in enumerate(
File "C:\Users\
\syncdaemon\
for entry in self._iter_
File "C:\Users\
\syncdaemon\
entry, new_pos = self.read(fd)
File "C:\Users\
\syncdaemon\
data = fd.read(
exceptions.
tests.syncdaemo
Guillermo Gonzalez (verterok) wrote:
Thanks for spotting that one.
It's fixed and pushed.
Manuel de la Peña (mandel) wrote:
Everything works on all currently supported platforms!
Ubuntu One Auto Pilot (otto-pilot) wrote:
The attempt to merge lp:~verterok/ubuntuone-client/fix-987376 into lp:ubuntuone-client failed. Below is the output from the failed tests.
/usr/bin/
checking for autoconf >= 2.53...
testing autoconf2.50... not found.
testing autoconf... found 2.68
checking for automake >= 1.10...
testing automake-1.11... found 1.11.3
checking for libtool >= 1.5...
testing libtoolize... found 2.4.2
checking for intltool >= 0.30...
testing intltoolize... found 0.50.2
checking for pkg-config >= 0.14.0...
testing pkg-config... found 0.26
checking for gtk-doc >= 1.0...
testing gtkdocize... found 1.18
Checking for required M4 macros...
Checking for forbidden M4 macros...
Processing ./configure.ac
Running libtoolize...
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
Running intltoolize...
Running gtkdocize...
Running aclocal-1.11...
Running autoconf...
Running autoheader...
Running automake-1.11...
Running ./configure --enable-gtk-doc --enable-debug ...
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for library containing strerror... none required
checking for gcc... (cached) gcc
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to accept ISO C89... (cached) none needed
checking dependency style of gcc... (cached) gcc3
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to con...
Facundo Batista (facundo) wrote:
Tested again with the latest changes.
Ubuntu One Auto Pilot (otto-pilot) wrote:
The attempt to merge lp:~verterok/ubuntuone-client/fix-987376 into lp:ubuntuone-client failed. Below is the output from the failed tests.
/usr/bin/
checking for autoconf >= 2.53...
testing autoconf2.50... not found.
testing autoconf... found 2.68
checking for automake >= 1.10...
testing automake-1.11... found 1.11.3
checking for libtool >= 1.5...
testing libtoolize... found 2.4.2
checking for intltool >= 0.30...
testing intltoolize... found 0.50.2
checking for pkg-config >= 0.14.0...
testing pkg-config... found 0.26
checking for gtk-doc >= 1.0...
testing gtkdocize... found 1.18
Checking for required M4 macros...
Checking for forbidden M4 macros...
Processing ./configure.ac
Running libtoolize...
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
Running intltoolize...
Running gtkdocize...
Running aclocal-1.11...
Running autoconf...
Running autoheader...
Running automake-1.11...
Running ./configure --enable-gtk-doc --enable-debug ...
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for library containing strerror... none required
checking for gcc... (cached) gcc
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to accept ISO C89... (cached) none needed
checking dependency style of gcc... (cached) gcc3
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to con...
Unmerged revisions
- 1234. By Guillermo Gonzalez: fix tests
- 1233. By Guillermo Gonzalez: fix bad_crc tests to run on windows.
- 1232. By Guillermo Gonzalez: don't iterate the generator in DataFile.iter_entries, just return it.
- 1231. By Guillermo Gonzalez: fix max mmaped file limit
- 1230. By Guillermo Gonzalez: only use mmap in data files <= 2**32, to avoid hitting address space limits, and fallback to standard IO.
Preview Diff
=== modified file 'tests/syncdaemon/test_tritcask.py'
--- tests/syncdaemon/test_tritcask.py	2012-04-09 20:07:05 +0000
+++ tests/syncdaemon/test_tritcask.py	2012-05-08 14:03:17 +0000
@@ -205,8 +205,8 @@
             db.put(i, 'foo%d' % (i,), 'bar%s' % (i,))
         # write a different value -> random bytes
         # now write some garbage to the end of file
-        db.live_file.fd.write(os.urandom(100))
-        db.live_file.fd.flush()
+        db.live_file.fd.seek(db.live_file.fd.tell()-3)
+        db.live_file.fd.write(os.urandom(3))
         # and add 10 new entries
         for i in range(10, 20):
             db.put(i, 'foo%d' % (i,), 'bar%s' % (i,))
@@ -290,7 +290,7 @@
         with contextlib.closing(fmap):
             current_pos = 0
             (crc32, tstamp, key_sz, value_sz, row_type,
-             key, value, pos), new_pos = data_file.read(fmap, current_pos)
+             key, value, pos), new_pos = data_file.read(fmap)
             current_pos = new_pos
             self.assertEqual(crc32_size+header_size+key_sz+value_sz, new_pos)
             self.assertEqual(orig_tstamp, tstamp)
@@ -299,7 +299,7 @@
             self.assertEqual('bar', value)
             self.assertEqual(0, row_type)
             (crc32, tstamp, key_sz, value_sz, row_type,
-             key, value, pos), new_pos = data_file.read(fmap, current_pos)
+             key, value, pos), new_pos = data_file.read(fmap)
             self.assertEqual(
                 crc32_size+header_size+key_sz+value_sz+current_pos, new_pos)
             self.assertEqual(tstamp1, tstamp)
@@ -323,7 +323,7 @@
         fd.flush()
         fmap = mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ)
         with contextlib.closing(fmap):
-            self.assertRaises(BadCrc, data_file.read, fmap, 0)
+            self.assertRaises(BadCrc, data_file.read, fmap)
 
     def test_read_bad_header(self):
         """Test for read method with a bad header/unpack error."""
@@ -340,7 +340,14 @@
         fd.flush()
         fmap = mmap.mmap(fd.fileno(), 0, access=mmap.ACCESS_READ)
         with contextlib.closing(fmap):
-            self.assertRaises(BadHeader, data_file.read, fmap, 0)
+            self.assertRaises(BadHeader, data_file.read, fmap)
+
+
+class NoMmapDataFileTest(DataFileTest):
+
+    def setUp(self):
+        self.patch(DataFile, 'max_mmap_size', 1)
+        return super(NoMmapDataFileTest, self).setUp()
 
 
 class TempDataFileTest(DataFileTest):
@@ -466,10 +473,9 @@
         db = Tritcask(self.base_dir)
         for i in range(10):
             db.put(i, 'foo%d' % (i,), 'bar%s' % (i,))
-        # write a different value -> random bytes
-        # now write some garbage to the end of file
-        db.live_file.fd.write(os.urandom(100))
-        db.live_file.fd.flush()
+        # write a different value to the last item.
+        db.live_file.fd.seek(db.live_file.fd.tell()-3)
+        db.live_file.fd.write(os.urandom(3))
         # and add 10 new entries
         for i in range(10, 20):
             db.put(i, 'foo%d' % (i,), 'bar%s' % (i,))
@@ -567,6 +573,25 @@
         self.assertEqual(data_file.hint_size, len("some data"))
 
 
+class NoMmapImmutableDataFileTest(ImmutableDataFileTest):
+
+    def setUp(self):
+        self.patch(ImmutableDataFile, 'max_mmap_size', 1)
+        return super(NoMmapImmutableDataFileTest, self).setUp()
+
+    def test__open(self):
+        """Test the _open private method."""
+        new_file = DataFile(self.base_dir)
+        # write some data
+        new_file.fd.write('foo')
+        immutable_file = new_file.make_immutable()
+        self.assertTrue(immutable_file.fd is not None)
+        self.assertTrue(immutable_file.fmmap is None)
+        # check that the file is opened only for read
+        self.assertRaises(IOError, immutable_file.fd.write, 'foo')
+        immutable_file.close()
+
+
 class DeadDataFileTest(ImmutableDataFileTest):
     """Tests for DeadDataFile."""
 
@@ -1277,9 +1302,10 @@
         for i in range(10):
             db.put(i, 'foo%d' % (i,), 'bar%s' % (i,))
         self.assertFalse(db.should_rotate())
-        # write a different value -> random bytes
-        # now write some garbage to the end of file
-        db.live_file.fd.write(os.urandom(100))
+        # write a different value to the last item.
+        # write a different value to the last item.
+        db.live_file.fd.seek(db.live_file.fd.tell()-5)
+        db.live_file.fd.write(os.urandom(5))
         db.live_file.fd.flush()
         db.shutdown()
         called = []
@@ -1885,3 +1911,4 @@
         """Test that the initial value is > 0."""
         timer = WindowsTimer()
         self.assertTrue(int(timer.time()) > 0)
+

=== modified file 'ubuntuone/syncdaemon/tritcask.py'
--- ubuntuone/syncdaemon/tritcask.py	2012-04-09 20:07:05 +0000
+++ ubuntuone/syncdaemon/tritcask.py	2012-05-08 14:03:17 +0000
@@ -74,10 +74,6 @@
 VERSION = 'v1'
 FILE_SUFFIX = '.tritcask-%s.data' % VERSION
 
-EXTRA_SEEK = False
-if sys.platform == 'win32':
-    EXTRA_SEEK = True
-
 logger = logging.getLogger('ubuntuone.SyncDaemon.tritcask')
 
 
@@ -182,6 +178,7 @@
     """Class that encapsulates data file handling."""
 
     last_generated_id = 0
+    max_mmap_size = int(2**31-1)  # cause 2**31 it's a long in 32bits.
 
     def __init__(self, base_path, filename=None):
         """Create a DataFile instance.
@@ -250,18 +247,23 @@
         self.fd = None
 
     def iter_entries(self):
-        """Return a generator for the entries in the file."""
-        fmmap = mmap.mmap(self.fd.fileno(), 0, access=mmap.ACCESS_READ)
-        with contextlib.closing(fmmap):
-            for entry in self._iter_mmaped_entries(fmmap):
+        """Return a generator for the entries in the file using mmap."""
+        if self.size >= self.max_mmap_size:
+            for entry in self._iter_entries(self.fd):
                 yield entry
+        else:
+            fmmap = mmap.mmap(self.fd.fileno(), 0, access=mmap.ACCESS_READ)
+            with contextlib.closing(fmmap):
+                for entry in self._iter_entries(fmmap):
+                    yield entry
 
-    def _iter_mmaped_entries(self, fmmap):
+    def _iter_entries(self, fd):
         """Return a generator for the entries in the mmaped file."""
         current_pos = 0
+        fd.seek(current_pos)
         while True:
             try:
-                entry, new_pos = self.read(fmmap, current_pos)
+                entry, new_pos = self.read(fd)
                 current_pos = new_pos
                 yield entry
             except EOFError:
@@ -294,10 +296,8 @@
         tstamp = timestamp()
         header = header_struct.pack(tstamp, key_sz, value_sz, row_type)
         crc32 = crc32_struct.pack(zlib.crc32(header + key + value))
-        if EXTRA_SEEK:
-            # seek to end of file even if we are in append mode, but py2.x IO
-            # in win32 is really buggy, see: http://bugs.python.org/issue3207
-            self.fd.seek(0, os.SEEK_END)
+        # always go to the EOF before write, as we aren't using mmap any more.
+        self.fd.seek(0, os.SEEK_END)
         self.fd.write(crc32 + header)
         self.fd.write(key)
         value_pos = self.fd.tell()
@@ -305,12 +305,13 @@
         self.fd.flush()
         return tstamp, value_pos, value_sz
 
-    def read(self, fmmap, current_pos):
+    def read(self, fd):
         """Read a single entry from the current position."""
-        crc32_bytes = fmmap[current_pos:current_pos + crc32_size]
-        current_pos += crc32_size
-        header = fmmap[current_pos:current_pos + header_size]
-        current_pos += header_size
+        current_pos = fd.tell()
+        data = fd.read(crc32_size+header_size)
+        current_pos += crc32_size+header_size
+        crc32_bytes = data[:crc32_size]
+        header = data[crc32_size:]
         if header == '' or crc32_bytes == '':
             # reached EOF
             raise EOFError
@@ -319,10 +320,11 @@
             tstamp, key_sz, value_sz, row_type = header_struct.unpack(header)
         except struct.error, e:
             raise BadHeader(e)
-        key = fmmap[current_pos:current_pos + key_sz]
+        data = fd.read(key_sz+value_sz)
+        key = data[:key_sz]
         current_pos += key_sz
         value_pos = current_pos
-        value = fmmap[current_pos:current_pos + value_sz]
+        value = data[key_sz:]
         current_pos += value_sz
         # verify the crc32 of the data
         if zlib.crc32(header + key + value) == crc32:
@@ -370,8 +372,9 @@
 
     def _open(self):
         self.fd = open(self.filename, 'rb')
-        fmmap = mmap.mmap(self.fd.fileno(), 0, access=mmap.ACCESS_READ)
-        self.fmmap = fmmap
+        if self.size < self.max_mmap_size:
+            fmmap = mmap.mmap(self.fd.fileno(), 0, access=mmap.ACCESS_READ)
+            self.fmmap = fmmap
 
     def close(self):
         """Close the file descriptor and mmap."""
@@ -387,13 +390,20 @@
 
     def iter_entries(self):
         """Return a generator for the entries in the mmaped file."""
-        for entry in self._iter_mmaped_entries(self.fmmap):
-            yield entry
+        fd = self.fd
+        if self.fmmap is not None:
+            fd = self.fmmap
+        return self._iter_entries(fd)
 
     def __getitem__(self, item):
         """__getitem__ to support slicing and *only* slicing."""
         if isinstance(item, slice):
-            return self.fmmap[item]
+            # if we have an mmap, use it.
+            if self.fmmap is not None:
+                return self.fmmap[item]
+            else:
+                self.fd.seek(item.start)
+                return self.fd.read(item.stop - item.start)
         else:
             raise ValueError('Only slice is supported')
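To make the plain-IO read path concrete, here is a self-contained sketch of reading a record laid out as crc32 + header(tstamp, key_sz, value_sz, row_type) + key + value, detecting EOF via a short read the way the reworked read() does. The struct layouts and names below are assumptions for illustration, not the module's actual definitions.

```python
import io
import struct
import zlib

# Assumed layouts, for illustration only: a 4-byte crc32 prefix, then a
# fixed-size header (tstamp, key_sz, value_sz, row_type), then key + value.
crc32_struct = struct.Struct('>I')
header_struct = struct.Struct('>dIII')
crc32_size = crc32_struct.size
header_size = header_struct.size


class BadCrc(Exception):
    """The stored crc32 does not match the data."""


def write_record(fd, tstamp, row_type, key, value):
    """Append one record at fd's current position."""
    header = header_struct.pack(tstamp, len(key), len(value), row_type)
    crc32 = crc32_struct.pack(zlib.crc32(header + key + value))
    fd.write(crc32 + header + key + value)


def read_record(fd):
    """Read one record from fd's current position.

    A short read on the fixed-size prefix means end of file, which is how
    EOF is detected without an mmap to slice past the end of.
    """
    data = fd.read(crc32_size + header_size)
    if len(data) < crc32_size + header_size:
        raise EOFError
    crc32_bytes, header = data[:crc32_size], data[crc32_size:]
    tstamp, key_sz, value_sz, row_type = header_struct.unpack(header)
    payload = fd.read(key_sz + value_sz)
    key, value = payload[:key_sz], payload[key_sz:]
    # verify the crc32 of the data before trusting it
    if zlib.crc32(header + key + value) != crc32_struct.unpack(crc32_bytes)[0]:
        raise BadCrc()
    return (tstamp, key_sz, value_sz, row_type, key, value)
```

Iterating a file is then just calling read_record in a loop until EOFError, which is the shape of the reworked _iter_entries working over either an mmap or the raw file object.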
Thanks!