Merge ~racb/git-ubuntu:nofollow-broken-symlinks into git-ubuntu:master

Proposed by Robie Basak
Status: Merged
Approved by: Robie Basak
Approved revision: 41487afc2d990c52c07c3d849b8e7c6919343189
Merged at revision: 41487afc2d990c52c07c3d849b8e7c6919343189
Proposed branch: ~racb/git-ubuntu:nofollow-broken-symlinks
Merge into: git-ubuntu:master
Diff against target: 57 lines (+20/-3)
2 files modified
gitubuntu/git_repository.py (+12/-3)
gitubuntu/git_repository_test.py (+8/-0)
Reviewer Review Type Date Requested Status
Bryce Harrington Approve
Server Team CI bot continuous-integration Pending
Review via email: mp+380713@code.launchpad.net

Commit message

Make Jenkins happy

To post a comment you must log in.
Revision history for this message
Robie Basak (racb) wrote :

The change to gitubuntu/source-package-whitelist.txt isn't actually part of this branch. This is a Launchpad error in generating the preview.

Revision history for this message
Bryce Harrington (bryce) wrote :

Had to lookup tmpdir, as I hadn't used that before myself. It's basically a py.path rooted in a temporary directory space. Its .join() operation essentially appends the given name, 'foo', as a subdirectory in this area. It doesn't actually create the file, if I understand correctly.

Next we instantiate this file 'foo' as a symlink. 'foo' points to another file which we name 'nonexistent' using .join(), which doesn't exist.

Finally, we attempt to create a RenameableDir object. RenamableDir's constructor does not operate on py.path objects, just simple string filenames, so we extract the path as a string and pass it in. RenameableDir requires this path exists though, but in the case of broken symlinks this triggers an assertion.

Correct the above if my understanding of the details is in error.

---

So, review comments:

1. The RenameableDir docs for __init__ specifies the path "must exist". These should probably be clarified for this case. This would be a good place to explain the distinction for broken symlinks.

The docs for the recursive() property are also describing handling of files, dirs, and symlinks, and would likely be another good place to mention this corner case.

2. Somewhere (I'm not sure where) there should also be an explanation of _why_ we're allowing broken symlinks to be allowed. From the bug report, commit messages, and code, I'm not spotting a explanation for this. I'm not doubting that it's something we need to do to fix this error, just that the paperwork here seem to be missing the justification for allowing broken symlinks to be treated as valid files.

3. Just for added clarity, I'd suggest somewhat different path variable names in the test:

+def test_renameable_dir_recursive_symlink_directory(tmpdir):
+ test_symlink = tmpdir.join('test_symlink')
+ nonexistent_file = tmpdir.join('nonexistent_file')
+ test_symlink.mksymlinkto(nonexistent_file)

I think that'd give the reader a bit more clarity as to what's being done.

4. We've been adding code docs for test cases, I notice they're a bit light in this file, this might be a good opportunity to add one. (Although, I think with the changes in #3 it'd be less necessary)

5. I notice it doesn't look like we have a test case to check the invalid case. I.e. creating a RenameableDir for a non-existent file. This seems pertinent as a pair with this change, and I'd suggest adding it. Guessing its implementation could look something like:

+ nonexistent_file = tmpdir.join('nonexistent_file')
+ with pytest.raises(AssertionError):
+ target.RenameableDir(str(nonexistent_file))

(Better would be if RenameableDir were to throw its own FileDoesntExist exception, I suppose.)

---
Hopefully these review comments are easy enough to tackle, that ended up being a lot of review comments for a short change, sorry about that!

review: Needs Fixing
Revision history for this message
Robie Basak (racb) wrote :
Download full text (3.4 KiB)

On Mon, Mar 16, 2020 at 08:16:35PM -0000, Bryce Harrington wrote:
> Correct the above if my understanding of the details is in error.

No - you've got it exactly.

> So, review comments:
>
> 1. The RenameableDir docs for __init__ specifies the path "must exist". These should probably be clarified for this case. This would be a good place to explain the distinction for broken symlinks.

Done.

> The docs for the recursive() property are also describing handling of files, dirs, and symlinks, and would likely be another good place to mention this corner case.

Done.

> 2. Somewhere (I'm not sure where) there should also be an explanation of _why_ we're allowing broken symlinks to be allowed. From the bug report, commit messages, and code, I'm not spotting a explanation for this. I'm not doubting that it's something we need to do to fix this error, just that the paperwork here seem to be missing the justification for allowing broken symlinks to be treated as valid files.

I've done this by adding to the RenameableDir class docstring.

> 3. Just for added clarity, I'd suggest somewhat different path variable names in the test:
>
> +def test_renameable_dir_recursive_symlink_directory(tmpdir):
> + test_symlink = tmpdir.join('test_symlink')
> + nonexistent_file = tmpdir.join('nonexistent_file')
> + test_symlink.mksymlinkto(nonexistent_file)
>
> I think that'd give the reader a bit more clarity as to what's being done.

Done. That's a great improvement - thank you.

> 4. We've been adding code docs for test cases, I notice they're a bit light in this file, this might be a good opportunity to add one. (Although, I think with the changes in #3 it'd be less necessary)

Added.

> 5. I notice it doesn't look like we have a test case to check the invalid case. I.e. creating a RenameableDir for a non-existent file. This seems pertinent as a pair with this change, and I'd suggest adding it. Guessing its implementation could look something like:
>
> + nonexistent_file = tmpdir.join('nonexistent_file')
> + with pytest.raises(AssertionError):
> + target.RenameableDir(str(nonexistent_file))
>
> (Better would be if RenameableDir were to throw its own FileDoesntExist exception, I suppose.)

I agree that it's appropriate to be defensive here. But I argue that
we're already doing this. The assertion is there, and it works. I accept
it was a false positive in our case that caused the problem; the usual
way to test for that is to add a test that it works for the false
positive case, which is now done.

The specification of the class constructor, via the docstring, says that
the path must exist. In my view, that means that it's explicitly the
caller's responsibility to ensure, and that behaviour if the rule is
violated is then "undefined".

So we don't need to write a test the assertion is raised, since it is
not defined in the constructor specification that it is to be raised.
Such a test wouldn't be verifying defined behaviour. We could modify that
specification and return a proper exception as you say, but I'm not sure
I see the value in adding that. We do want to be defensive and fail if
the caller is wrong, but we're already ...

Read more...

Revision history for this message
Bryce Harrington (bryce) wrote :

Looks good.

I usually try to treat invalid cases as equally important as valid cases for writing tests to. Not just to check correct behavior, but also from a security viewpoint to make sure things fail in an expected fashion when presented with erroneous data. You're right though, that in this case the valid test case, and the run time assert should be sufficient for catching ordinary errors.

Anyway, if it were me, yes I'd still add it (and make the assertion itself documented, and throwing a defined exception rather than generic AssertionError). But I won't press the point if you think otherwise - this is a distinct case from what this patch is aiming to solve, and your patch is good so no reason to delay landing it now.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/gitubuntu/git_repository.py b/gitubuntu/git_repository.py
2index afa4fc5..002fa28 100644
3--- a/gitubuntu/git_repository.py
4+++ b/gitubuntu/git_repository.py
5@@ -223,13 +223,21 @@ class RenameableDir:
6 setter to handle the rename and replacement wrapped string object
7 transparently. This moves complexity away from the class consumer, allowing
8 the consumer to be tested more easily.
9+
10+ Since the underlying purpose of this class is to handle manipulations of a
11+ directory tree for adjustments needed during import/export, symlink
12+ handling is effectively "turned off" in the specification of this class.
13+ Symlinks to directories are not recursed into; they are handled no
14+ differently to a regular file, in the same manner as lstat(2).
15 """
16 def __init__(self, path):
17 """Create a new RenameableDir instance.
18
19- :param str path: the on-disk directory to wrap, which must exist.
20+ :param str path: the on-disk directory to wrap, which must exist. For
21+ symlinks, it is the symlink itself that must exist; the existence
22+ of a symlink's target does not matter.
23 """
24- assert os.path.exists(path)
25+ assert os.path.lexists(path)
26 self._path = path
27
28 @property
29@@ -264,7 +272,8 @@ class RenameableDir:
30 An object representing a file will return False. An object representing
31 a directory will return True, even if it is empty.
32
33- Symlinks return False even if they point to a directory.
34+ Symlinks return False even if they point to a directory. Broken
35+ symlinks also always return False.
36
37 :rtype: bool
38 """
39diff --git a/gitubuntu/git_repository_test.py b/gitubuntu/git_repository_test.py
40index fc86094..773c947 100644
41--- a/gitubuntu/git_repository_test.py
42+++ b/gitubuntu/git_repository_test.py
43@@ -303,6 +303,14 @@ def test_renamable_dir_recursive(tmpdir):
44 assert not target.RenameableDir(str(b)).recursive
45
46
47+def test_renameable_dir_recursive_symlink_directory(tmpdir):
48+ """A RenameableDir should not treat a broken symlink as recursive"""
49+ test_symlink = tmpdir.join('foo')
50+ nonexistent_file = tmpdir.join('nonexistent_file')
51+ test_symlink.mksymlinkto(nonexistent_file)
52+ assert not target.RenameableDir(str(test_symlink)).recursive
53+
54+
55 def test_renameable_dir_str(tmpdir):
56 p = tmpdir.join('foo')
57 p.ensure()

Subscribers

People subscribed via source and target branches