Merge lp:~weyrick/bzr/54624-warn-on-large-files into lp:bzr
- 54624-warn-on-large-files
- Merge into bzr.dev
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Jonathan Riddell | ||||
Approved revision: | no longer in the source branch. | ||||
Merged at revision: | 6086 | ||||
Proposed branch: | lp:~weyrick/bzr/54624-warn-on-large-files | ||||
Merge into: | lp:bzr | ||||
Diff against target: |
336 lines (+169/-10) 9 files modified
bzrlib/add.py (+44/-0) bzrlib/builtins.py (+5/-1) bzrlib/config.py (+40/-0) bzrlib/help_topics/en/configuration.txt (+8/-0) bzrlib/mutabletree.py (+13/-6) bzrlib/osutils.py (+6/-3) bzrlib/tests/blackbox/test_add.py (+24/-0) bzrlib/tests/test_config.py (+20/-0) doc/en/release-notes/bzr-2.5.txt (+9/-0) |
||||
To merge this branch: | bzr merge lp:~weyrick/bzr/54624-warn-on-large-files | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
martin suchanek (community) | Needs Fixing | ||
Jonathan Riddell (community) | Needs Fixing | ||
Jelmer Vernooij (community) | Needs Information | ||
Review via email: mp+70691@code.launchpad.net |
Commit message
bzr add now skips large files in recursive mode. The default "large"
size is 20MB, and is configurable via the add.maximum_
option. A value of 0 disables skipping. Named items passed to add are
never skipped. (Shannon Weyrick, #54624)
Description of the change
This branch proposes to add the feature requested in bug #54624: to skip (with warning) on large file addition, with configurable "large" threshold.
Robert Collins (lifeless) wrote : | # |
Shannon Weyrick (weyrick) wrote : | # |
I thought about this, and in the end I did 1 MB simply because that's
what the original bug mentioned. I wonder, what's the best way to get
a feel about a reasonable default from the current user base? Mailing
list?
On Sun, Aug 7, 2011 at 8:37 PM, Robert Collins
<email address hidden> wrote:
> 1MB might be a little small :)
>
> --
> https:/
> You are the owner of lp:~weyrick/bzr/54624-warn-on-large-files.
>
Shannon Weyrick (weyrick) wrote : | # |
A nice way to figure out the default would be to get a histogram of file sizes in the bzr repos hosted on launchpad. Then we could pick something towards the top end of the decline in the curve. I don't suppose that info is readily available....
Robert Collins (lifeless) wrote : | # |
On Mon, Aug 8, 2011 at 1:37 PM, Shannon Weyrick <email address hidden> wrote:
> A nice way to figure out the default would be to get a histogram of file sizes in the bzr repos hosted on launchpad. Then we could pick something towards the top end of the decline in the curve. I don't suppose that info is readily available....
Not preprocessed, but the vast majority of branches are public, so you
should be able to use bzrlib's API to read the inventories of those
branches - just the tip revision might be sufficient.
-Rob
Martin Pool (mbp) wrote : | # |
Thanks for this Shannon, I'm sure it will save some users some grief.
I think the default 'large' should probably be set at the kind of size
where there's a fair chance people would regret adding it, because of
the impact on disk/memory usage/transfer time. 1MB shouldn't be a
problem for any of them and I think it's fairly likely people will
have assets that big (if not actual source files.) Computers are
bigger than they were when Vesta was new...
I think say 50MB would be nearer the mark.
I don't think it's especially worth scanning Launchpad branches unless
you really want to. You might just as well look in your own home
directory or perhaps ask on the list.
As far as implementation:
+ _DEFAULT_
[nit] we don't normally have semicolons at the end of statements
[optional] A lot of _large_
configuration code for "a non-negative integer with a default". You
could at least hoist this out; eventually it might be good if the
default was declared along with the option name.
[fix] Our convention now is that option names are grouped into the
area they relate to; I think this is only for add so it should be
probably add.large_
obvious as something like add.maximum_
+ os.path.
[optional] This will do another system call to get the size and we
probably already have it and can use the cached value. But it will be
a cheap call, since the inode should already in cache.
+ "large_
[tweak] For some reason we always say %d not %i.
[fix] This ought to be described in help_topics/
[fix] This probably also needs to be in the user documentation; I'm
not sure off hand where would be most appropriate.
Martin
Shannon Weyrick (weyrick) wrote : | # |
> Thanks for this Shannon, I'm sure it will save some users some grief.
>
> I think the default 'large' should probably be set at the kind of size
> where there's a fair chance people would regret adding it, because of
> the impact on disk/memory usage/transfer time. 1MB shouldn't be a
> problem for any of them and I think it's fairly likely people will
> have assets that big (if not actual source files.) Computers are
> bigger than they were when Vesta was new...
>
> I think say 50MB would be nearer the mark.
>
> I don't think it's especially worth scanning Launchpad branches unless
> you really want to. You might just as well look in your own home
> directory or perhaps ask on the list.
>
I agree 1 MB is almost definitely too low, otoh 50 sounds high to me. I was thinking something on the order of 10-20MB - that almost definitely covers all source files and most normal resources from web projects as well. But, I'm not stuck on that number.
What would be real handy here is the first time this limit was reached, it could prompt the user if they wanted to raise/set it. Is there infrastructure in place in the ui for that kind of thing?
> As far as implementation:
>
> + _DEFAULT_
>
> [nit] we don't normally have semicolons at the end of statements
>
np. obviously my c-like lang background showing here :)
> [optional] A lot of _large_
> configuration code for "a non-negative integer with a default". You
> could at least hoist this out; eventually it might be good if the
> default was declared along with the option name.
>
Actually, I think what we really want is to not force the user to enter an integer for this, but rather "20M" or "1G". I could whip something up for this and stick it in Config.
Is there a central place defaults/option names are defined? I didn't see one.
> [fix] Our convention now is that option names are grouped into the
> area they relate to; I think this is only for add so it should be
> probably add.large_
> obvious as something like add.maximum_
>
> + os.path.
>
Yes, maximum_file_size sounds better.
> [optional] This will do another system call to get the size and we
> probably already have it and can use the cached value. But it will be
> a cheap call, since the inode should already in cache.
>
I thought of this as well, and thought it should be cached on some level as the size is/will be used in the add. But I'll investigate using the bzr level cache - which I think I saw in the tree? I'll check.
> + "large_
>
> [tweak] For some reason we always say %d not %i.
>
> [fix] This ought to be described in help_topics/
>
> [fix] This probably also needs to be in the user documentation; I'm
> not sure off hand where would be most appropriate.
>
No prob on these items.
Thanks for the help! I'll work on these changes as I have time over the next few days.
Shannon
Jelmer Vernooij (jelmer) wrote : | # |
Should this be this close to the core? I wonder if it makes more sense as a UI-level thing that's just part of "bzr add".
With the current approach, we'd have to introduce workarounds for things like the UDD importer, but I presume there will also be a fair number of packages for which "bzr merge-upstream" will now fail.
It also means all custom tree implementations will have to explicitly add support.
Jelmer Vernooij (jelmer) : | # |
Martin Pool (mbp) wrote : | # |
@jelmer Of course you're right, i'm glad you caught this. Perhaps the interface by which add reports back to the ui (I think there is a result or something) would be better.
Martin Pool (mbp) wrote : | # |
On 9 August 2011 01:01, Shannon Weyrick <email address hidden> wrote:
>> Thanks for this Shannon, I'm sure it will save some users some grief.
>>
>> I think the default 'large' should probably be set at the kind of size
>> where there's a fair chance people would regret adding it, because of
>> the impact on disk/memory usage/transfer time. 1MB shouldn't be a
>> problem for any of them and I think it's fairly likely people will
>> have assets that big (if not actual source files.) Computers are
>> bigger than they were when Vesta was new...
>>
>> I think say 50MB would be nearer the mark.
>>
>> I don't think it's especially worth scanning Launchpad branches unless
>> you really want to. You might just as well look in your own home
>> directory or perhaps ask on the list.
>>
>
> I agree 1 MB is almost definitely too low, otoh 50 sounds high to me. I was thinking something on the order of 10-20MB - that almost definitely covers all source files and most normal resources from web projects as well. But, I'm not stuck on that number.
10-20MB is also fine with me.
> What would be real handy here is the first time this limit was reached, it could prompt the user if they wanted to raise/set it. Is there infrastructure in place in the ui for that kind of thing?
There is some infrastructure towards that - actually this reminds me
of something I should have said before, which is that rather than
trace.warning you should use ui_factory.warning - that takes a
structure identifier of the type of warning, with the idea that
eventually we can give the user an option to not show the warning any
more, or perhaps to change it.
> Actually, I think what we really want is to not force the user to enter an integer for this, but rather "20M" or "1G". I could whip something up for this and stick it in Config.
That would be great.
> Is there a central place defaults/option names are defined? I didn't see one.
There is option_registry in config.py.
Robert Collins (lifeless) wrote : | # |
On performance: we've found add to be pretty sensitive to syscall
volumes these days; I would personally want a test on a 50K file tree
before discounting the impact of additional syscalls.
Shannon Weyrick (weyrick) wrote : | # |
Ok, thanks for the feedback everyone. To summarize then, here are my goals for completing this:
1) Ensure that skipping a large file doesn't happen directly in MutableTree.
2) Ensure that the stat for filesize is cached. Benchmark on large import.
3) Add a simple parser for specifying values in "human readable" sizes (i.e. "500M") to Config
4) Default limit to 20M
5) Possibly prompt user if the limit is hit and they haven't overridden the default
6) Use config name "add.maximum_
7) Document in configuration.txt and user docs
Shannon Weyrick (weyrick) wrote : | # |
All the goals have been met in the current patch, except 5 and 2. 5 I'm not doing for now. 2 is a larger problem - some informal tests importing large trees show about 23% slowdown with the extra stat. Unfortunately I don't see a simple path towards sharing stat calls - they seem to be spread throughout the code now with no central way to cache. I'm looking into this further now.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 8/12/2011 4:55 PM, Shannon Weyrick wrote:
> All the goals have been met in the current patch, except 5 and 2. 5
> I'm not doing for now. 2 is a larger problem - some informal tests
> importing large trees show about 23% slowdown with the extra stat.
> Unfortunately I don't see a simple path towards sharing stat calls -
> they seem to be spread throughout the code now with no central way to
> cache. I'm looking into this further now.
>
Is it possible to put it in as part of get_file_with_stat or something
along those lines? That uses 'fstat' to make sure to stat the object
that is already in memory.
I do realize that add does things slightly differently, however we
already have the stat object around there, because we need to know if
the object is a file or a directory.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAk5
GwQAn0xfkHedGsN
=S28e
-----END PGP SIGNATURE-----
Shannon Weyrick (weyrick) wrote : | # |
>
> Is it possible to put it in as part of get_file_with_stat or something
> along those lines? That uses 'fstat' to make sure to stat the object
> that is already in memory.
>
> I do realize that add does things slightly differently, however we
> already have the stat object around there, because we need to know if
> the object is a file or a directory.
>
Yes currently there's a stat to determine the file kind, which is using osutils.file_kind. Actually, it only does this if the inventory entry didn't have the kind, but I haven't hit a case where that's true in my tests. But I don't see a trivial way to have the file size check share the stat with the one from file_kind ... So it seems add would have to use something other than osutils.file_kind. If get_file_with_stat will work for both, that may do it. I'll take a look, thanks.
Shannon
Shannon Weyrick (weyrick) wrote : | # |
get_file_with_stat did not seem applicable, because it required a file_id which was not yet available since the file isn't added yet. Instead, I added a file_stat function (which may potentially cache results, but doesn't currently), and in turn made file_kind use this. _SmartAddHelper.add now uses file_stat which can pass the results on to the new AddAction. Therefore, there is a single stat call and my tests on a large tree (linux kernel, ~39k files) show these changes to be only about .01 - .03 seconds slower than bzr.dev on my system.
So, I believe all goals have been met, please review.
Shannon
Jelmer Vernooij (jelmer) wrote : | # |
Hi Shannon,
Thanks for you patience with this. I think this is broadly ok, though I have a few more minor nitpicks:
We don't use camel casing for method names, so skipFile should be skip_file.
I'm not sure if it's part of the syntax guide, but we generally don't add parentheses where they aren't necessary, e.g. in if statements.
Please don't use relative imports - i.e. "from bzrlib import ui" rather than "import ui".
except without arguments can easily hide bugs (it also catches e.g. NameError) - or "eat" things like KeyboardInterrupts, making it impossible to use Ctrl+C. Please catch specific exceptions instead.
Shannon Weyrick (weyrick) wrote : | # |
Jelmer, thanks I've made those changes. Let me know if there's something else.
Shannon
Jonathan Riddell (jr) wrote : | # |
Unfortunately this doesn't work if you specify the file name on the bzr add command, in this case the code path seems to go through mutabletree.py _SmartAddHelper
Also I think there should be a command line way to force adding large files (although that don't need to be part of this patch).
Shannon Weyrick (weyrick) wrote : | # |
Hi Jonathan,
I think your second comment is solved by the first. By that I mean, I
believe if you specify the file on the command line, it should never
be skipped because it's large because that's never what you'd want
(actually maybe a tab completion mistake, but that's not likely
enough). Therefore, an effective way to force adding a large file is
simply to specify it on the command line.
So if you had said "bzr add ." and it told you it skipped a few files,
you could follow it up with "bzr add foo bar" to get the large ones
you really wanted. Seems like a decent work flow to me, and solves the
problem of accidentally adding large files during a recursive import,
while still allowing flexibility with both the config option and the
explicit adds.
Shannon
On Wed, Aug 17, 2011 at 7:41 AM, Jonathan Riddell <email address hidden> wrote:
> Review: Needs Fixing
> Unfortunately this doesn't work if you specify the file name on the bzr add command, in this case the code path seems to go through mutabletree.py _SmartAddHelper
>
> Also I think there should be a command line way to force adding large files (although that don't need to be part of this patch).
>
> --
> https:/
> You are the owner of lp:~weyrick/bzr/54624-warn-on-large-files.
>
Jonathan Riddell (jr) wrote : | # |
sent to pqm by email
Jonathan Riddell (jr) wrote : | # |
sent to pqm by email
kesten broughton (dathomir) wrote : | # |
Hi all,
I got bit by this bug, so thanks a lot for fixing it. A couple of thoughts from a newbie:
On file size, the problem might not be one large file - 50 Mb, but 100 1Mb files, say in a photo album.
As i thought about it, i wonder if the large multi-media files which might be assets for a code project belong in a code revision repo. You don't really change photos or movies the way you would code and none of the merging, etc works for video. Perhaps a tutorial suggesting different ways to back up non-code components of a project would be useful.
On the other hand, documentation and video tutorials might belong in a code dev repository. I guess I'm wondering if anyone has thought this through and come up with an elegant solution.
thanks again all,
kesten
martin suchanek (martin-suc) wrote : | # |
Hi,
I am newbie. Could please anybody tell me how/where to setup add.maximum_
because I have too many errors like following:
bzr: warning: skipping /etc/apache2/
files are bigger than 65 Mbytes.
thank you,
kind regards,
M.
Vincent Ladeuil (vila) wrote : | # |
'bzr add -Oadd.maximum_
Preview Diff
1 | === modified file 'bzrlib/add.py' |
2 | --- bzrlib/add.py 2011-06-14 02:21:41 +0000 |
3 | +++ bzrlib/add.py 2011-08-19 22:28:34 +0000 |
4 | @@ -17,9 +17,11 @@ |
5 | """Helper functions for adding files to working trees.""" |
6 | |
7 | import sys |
8 | +import os |
9 | |
10 | from bzrlib import ( |
11 | osutils, |
12 | + ui, |
13 | ) |
14 | |
15 | |
16 | @@ -53,6 +55,48 @@ |
17 | self._to_file.write('adding %s\n' % _quote(path)) |
18 | return None |
19 | |
20 | + def skip_file(self, tree, path, kind, stat_value = None): |
21 | + """Test whether the given file should be skipped or not. |
22 | + |
23 | + The default action never skips. Note this is only called during |
24 | + recursive adds |
25 | + |
26 | + :param tree: The tree we are working in |
27 | + :param path: The path being added |
28 | + :param kind: The kind of object being added. |
29 | + :param stat: Stat result for this file, if available already |
30 | + :return bool. True if the file should be skipped (not added) |
31 | + """ |
32 | + return False |
33 | + |
34 | + |
35 | +class AddWithSkipLargeAction(AddAction): |
36 | + """A class that can decide to skip a file if it's considered too large""" |
37 | + |
38 | + # default 20 MB |
39 | + _DEFAULT_MAX_FILE_SIZE = 20000000 |
40 | + _optionName = 'add.maximum_file_size' |
41 | + _maxSize = None |
42 | + |
43 | + def skip_file(self, tree, path, kind, stat_value = None): |
44 | + if kind != 'file': |
45 | + return False |
46 | + if self._maxSize is None: |
47 | + config = tree.branch.get_config() |
48 | + self._maxSize = config.get_user_option_as_int_from_SI( |
49 | + self._optionName, |
50 | + self._DEFAULT_MAX_FILE_SIZE) |
51 | + if stat_value is None: |
52 | + file_size = os.path.getsize(path); |
53 | + else: |
54 | + file_size = stat_value.st_size; |
55 | + if self._maxSize > 0 and file_size > self._maxSize: |
56 | + ui.ui_factory.show_warning( |
57 | + "skipping %s (larger than %s of %d bytes)" % |
58 | + (path, self._optionName, self._maxSize)) |
59 | + return True |
60 | + return False |
61 | + |
62 | |
63 | class AddFromBaseAction(AddAction): |
64 | """This class will try to extract file ids from another tree.""" |
65 | |
66 | === modified file 'bzrlib/builtins.py' |
67 | --- bzrlib/builtins.py 2011-08-18 04:23:06 +0000 |
68 | +++ bzrlib/builtins.py 2011-08-19 22:28:34 +0000 |
69 | @@ -674,6 +674,10 @@ |
70 | |
71 | Any files matching patterns in the ignore list will not be added |
72 | unless they are explicitly mentioned. |
73 | + |
74 | + In recursive mode, files larger than the configuration option |
75 | + add.maximum_file_size will be skipped. Named items are never skipped due |
76 | + to file size. |
77 | """ |
78 | takes_args = ['file*'] |
79 | takes_options = [ |
80 | @@ -706,7 +710,7 @@ |
81 | action = bzrlib.add.AddFromBaseAction(base_tree, base_path, |
82 | to_file=self.outf, should_print=(not is_quiet())) |
83 | else: |
84 | - action = bzrlib.add.AddAction(to_file=self.outf, |
85 | + action = bzrlib.add.AddWithSkipLargeAction(to_file=self.outf, |
86 | should_print=(not is_quiet())) |
87 | |
88 | if base_tree: |
89 | |
90 | === modified file 'bzrlib/config.py' |
91 | --- bzrlib/config.py 2011-08-16 15:12:39 +0000 |
92 | +++ bzrlib/config.py 2011-08-19 22:28:34 +0000 |
93 | @@ -75,6 +75,7 @@ |
94 | import os |
95 | import string |
96 | import sys |
97 | +import re |
98 | |
99 | |
100 | from bzrlib.decorators import needs_write_lock |
101 | @@ -413,6 +414,45 @@ |
102 | # add) the final ',' |
103 | l = [l] |
104 | return l |
105 | + |
106 | + def get_user_option_as_int_from_SI(self, option_name, default=None): |
107 | + """Get a generic option from a human readable size in SI units, e.g 10MB |
108 | + |
109 | + Accepted suffixes are K,M,G. It is case-insensitive and may be followed |
110 | + by a trailing b (i.e. Kb, MB). This is intended to be practical and not |
111 | + pedantic. |
112 | + |
113 | + :return Integer, expanded to its base-10 value if a proper SI unit is |
114 | + found. If the option doesn't exist, or isn't a value in |
115 | + SI units, return default (which defaults to None) |
116 | + """ |
117 | + val = self.get_user_option(option_name) |
118 | + if isinstance(val, list): |
119 | + val = val[0] |
120 | + if val is None: |
121 | + val = default |
122 | + else: |
123 | + p = re.compile("^(\d+)([kmg])*b*$", re.IGNORECASE) |
124 | + try: |
125 | + m = p.match(val) |
126 | + if m is not None: |
127 | + val = int(m.group(1)) |
128 | + if m.group(2) is not None: |
129 | + if m.group(2).lower() == 'k': |
130 | + val *= 10**3 |
131 | + elif m.group(2).lower() == 'm': |
132 | + val *= 10**6 |
133 | + elif m.group(2).lower() == 'g': |
134 | + val *= 10**9 |
135 | + else: |
136 | + ui.ui_factory.show_warning('Invalid config value for "%s" ' |
137 | + ' value %r is not an SI unit.' |
138 | + % (option_name, val)) |
139 | + val = default |
140 | + except TypeError: |
141 | + val = default |
142 | + return val |
143 | + |
144 | |
145 | def gpg_signing_command(self): |
146 | """What program should be used to sign signatures?""" |
147 | |
148 | === modified file 'bzrlib/help_topics/en/configuration.txt' |
149 | --- bzrlib/help_topics/en/configuration.txt 2011-08-16 15:12:39 +0000 |
150 | +++ bzrlib/help_topics/en/configuration.txt 2011-08-19 22:28:34 +0000 |
151 | @@ -628,6 +628,14 @@ |
152 | If present, defines the ``--strict`` option default value for checking |
153 | uncommitted changes before sending a merge directive. |
154 | |
155 | +add.maximum_file_size |
156 | +~~~~~~~~~~~~~~~~~~~~~ |
157 | + |
158 | +Defines the maximum file size the command line "add" operation will allow |
159 | +in recursive mode, with files larger than this value being skipped. You may |
160 | +specify this value as an integer (in which case it is interpreted as bytes), |
161 | +or you may specify the value using SI units, i.e. 10KB, 20MB, 1G. A value of 0 |
162 | +will disable skipping. |
163 | |
164 | External Merge Tools |
165 | -------------------- |
166 | |
167 | === modified file 'bzrlib/mutabletree.py' |
168 | --- bzrlib/mutabletree.py 2011-07-23 16:33:38 +0000 |
169 | +++ bzrlib/mutabletree.py 2011-08-19 22:28:34 +0000 |
170 | @@ -582,8 +582,9 @@ |
171 | :param parent_ie: Parent inventory entry if known, or None. If |
172 | None, the parent is looked up by name and used if present, otherwise it |
173 | is recursively added. |
174 | + :param path: |
175 | :param kind: Kind of new entry (file, directory, etc) |
176 | - :param action: callback(tree, parent_ie, path, kind); can return file_id |
177 | + :param inv_path: |
178 | :return: Inventory entry for path and a list of paths which have been added. |
179 | """ |
180 | # Nothing to do if path is already versioned. |
181 | @@ -628,7 +629,7 @@ |
182 | if (prev_dir is None or not is_inside([prev_dir], path)): |
183 | yield (path, inv_path, this_ie, None) |
184 | prev_dir = path |
185 | - |
186 | + |
187 | def __init__(self, tree, action, conflicts_related=None): |
188 | self.tree = tree |
189 | if action is None: |
190 | @@ -695,12 +696,18 @@ |
191 | |
192 | # get the contents of this directory. |
193 | |
194 | - # find the kind of the path being added. |
195 | + # find the kind of the path being added, and save stat_value |
196 | + # for reuse |
197 | + stat_value = None |
198 | if this_ie is None: |
199 | - kind = osutils.file_kind(abspath) |
200 | + stat_value = osutils.file_stat(abspath) |
201 | + kind = osutils.file_kind_from_stat_mode(stat_value.st_mode) |
202 | else: |
203 | kind = this_ie.kind |
204 | - |
205 | + |
206 | + # allow AddAction to skip this file |
207 | + if self.action.skip_file(self.tree, abspath, kind, stat_value): |
208 | + continue |
209 | if not InventoryEntry.versionable_kind(kind): |
210 | trace.warning("skipping %s (can't add file of kind '%s')", |
211 | abspath, kind) |
212 | @@ -769,7 +776,7 @@ |
213 | # recurse into this already versioned subdir. |
214 | things_to_add.append((subp, sub_invp, sub_ie, this_ie)) |
215 | else: |
216 | - # user selection overrides ignoes |
217 | + # user selection overrides ignores |
218 | # ignore while selecting files - if we globbed in the |
219 | # outer loop we would ignore user files. |
220 | ignore_glob = self.tree.is_ignored(subp) |
221 | |
222 | === modified file 'bzrlib/osutils.py' |
223 | --- bzrlib/osutils.py 2011-08-12 12:18:34 +0000 |
224 | +++ bzrlib/osutils.py 2011-08-19 22:28:34 +0000 |
225 | @@ -2178,15 +2178,18 @@ |
226 | return file_kind_from_stat_mode(mode) |
227 | file_kind_from_stat_mode = file_kind_from_stat_mode_thunk |
228 | |
229 | - |
230 | -def file_kind(f, _lstat=os.lstat): |
231 | +def file_stat(f, _lstat=os.lstat): |
232 | try: |
233 | - return file_kind_from_stat_mode(_lstat(f).st_mode) |
234 | + # XXX cache? |
235 | + return _lstat(f) |
236 | except OSError, e: |
237 | if getattr(e, 'errno', None) in (errno.ENOENT, errno.ENOTDIR): |
238 | raise errors.NoSuchFile(f) |
239 | raise |
240 | |
241 | +def file_kind(f, _lstat=os.lstat): |
242 | + stat_value = file_stat(f, _lstat) |
243 | + return file_kind_from_stat_mode(stat_value.st_mode) |
244 | |
245 | def until_no_eintr(f, *a, **kw): |
246 | """Run f(*a, **kw), retrying if an EINTR error occurs. |
247 | |
248 | === modified file 'bzrlib/tests/blackbox/test_add.py' |
249 | --- bzrlib/tests/blackbox/test_add.py 2011-07-11 06:47:32 +0000 |
250 | +++ bzrlib/tests/blackbox/test_add.py 2011-08-19 22:28:34 +0000 |
251 | @@ -239,3 +239,27 @@ |
252 | out, err = self.run_bzr(["add", "a", "b"], working_dir=u"\xA7") |
253 | self.assertEquals(out, "adding a\n" "adding b\n") |
254 | self.assertEquals(err, "") |
255 | + |
256 | + def test_add_skip_large_files(self): |
257 | + """Test skipping files larger than add.maximum_file_size""" |
258 | + tree = self.make_branch_and_tree('.') |
259 | + self.build_tree(['small.txt', 'big.txt', 'big2.txt']) |
260 | + self.build_tree_contents([('small.txt', '0\n')]) |
261 | + self.build_tree_contents([('big.txt', '01234567890123456789\n')]) |
262 | + self.build_tree_contents([('big2.txt', '01234567890123456789\n')]) |
263 | + tree.branch.get_config().set_user_option('add.maximum_file_size', 5) |
264 | + out = self.run_bzr('add')[0] |
265 | + results = sorted(out.rstrip('\n').split('\n')) |
266 | + self.assertEquals(['adding small.txt'], |
267 | + results) |
268 | + # named items never skipped, even if over max |
269 | + out, err = self.run_bzr(["add", "big2.txt"]) |
270 | + results = sorted(out.rstrip('\n').split('\n')) |
271 | + self.assertEquals(['adding big2.txt'], |
272 | + results) |
273 | + self.assertEquals(err, "") |
274 | + tree.branch.get_config().set_user_option('add.maximum_file_size', 30) |
275 | + out = self.run_bzr('add')[0] |
276 | + results = sorted(out.rstrip('\n').split('\n')) |
277 | + self.assertEquals(['adding big.txt'], |
278 | + results) |
279 | |
280 | === modified file 'bzrlib/tests/test_config.py' |
281 | --- bzrlib/tests/test_config.py 2011-08-12 14:07:21 +0000 |
282 | +++ bzrlib/tests/test_config.py 2011-08-19 22:28:34 +0000 |
283 | @@ -1035,6 +1035,26 @@ |
284 | # automatically cast to list |
285 | self.assertEqual(['x'], get_list('one_item')) |
286 | |
287 | + def test_get_user_option_as_int_from_SI(self): |
288 | + conf, parser = self.make_config_parser(""" |
289 | +plain = 100 |
290 | +si_k = 5k, |
291 | +si_kb = 5kb, |
292 | +si_m = 5M, |
293 | +si_mb = 5MB, |
294 | +si_g = 5g, |
295 | +si_gb = 5gB, |
296 | +""") |
297 | + get_si = conf.get_user_option_as_int_from_SI |
298 | + self.assertEqual(100, get_si('plain')) |
299 | + self.assertEqual(5000, get_si('si_k')) |
300 | + self.assertEqual(5000, get_si('si_kb')) |
301 | + self.assertEqual(5000000, get_si('si_m')) |
302 | + self.assertEqual(5000000, get_si('si_mb')) |
303 | + self.assertEqual(5000000000, get_si('si_g')) |
304 | + self.assertEqual(5000000000, get_si('si_gb')) |
305 | + self.assertEqual(None, get_si('non-exist')) |
306 | + self.assertEqual(42, get_si('non-exist-with-default', 42)) |
307 | |
308 | class TestSupressWarning(TestIniConfig): |
309 | |
310 | |
311 | === modified file 'doc/en/release-notes/bzr-2.5.txt' |
312 | --- doc/en/release-notes/bzr-2.5.txt 2011-08-18 04:23:06 +0000 |
313 | +++ doc/en/release-notes/bzr-2.5.txt 2011-08-19 22:28:34 +0000 |
314 | @@ -78,6 +78,11 @@ |
315 | * Relative local paths can now be specified in URL syntax by using the |
316 | "file:" prefix. (Jelmer Vernooij) |
317 | |
318 | +* bzr add now skips large files in recursive mode. The default "large" |
319 | + size is 20MB, and is configurable via the add.maximum_file_size |
320 | + option. A value of 0 disables skipping. Named items passed to add are |
321 | + never skipped. (Shannon Weyrick, #54624) |
322 | + |
323 | Improvements |
324 | ************ |
325 | |
326 | @@ -169,6 +174,10 @@ |
327 | no longer support it. |
328 | (Martin Pool) |
329 | |
330 | +* New method ``Config.get_user_option_as_int_from_SI`` added for expanding a |
331 | + value in SI format (i.e. "20MB", "1GB") into its integer equivalent. |
332 | + (Shannon Weyrick) |
333 | + |
334 | * ``Transport`` now has a ``_parsed_url`` attribute instead of |
335 | separate ``_user``, ``_password``, ``_port``, ``_scheme``, ``_host`` |
336 | and ``_path`` attributes. Proxies are provided for the moment but |
1MB might be a little small :)