Merge lp:~themue/juju-core/057-tailer into lp:~go-bot/juju-core/trunk

Proposed by Frank Mueller
Status: Merged
Approved by: Frank Mueller
Approved revision: no longer in the source branch.
Merged at revision: 2152
Proposed branch: lp:~themue/juju-core/057-tailer
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 767 lines (+752/-0)
3 files modified
utils/tailer/export_test.go (+6/-0)
utils/tailer/tailer.go (+250/-0)
utils/tailer/tailer_test.go (+496/-0)
To merge this branch: bzr merge lp:~themue/juju-core/057-tailer
Reviewer: Juju Engineering, Status: Pending
Review via email: mp+197522@code.launchpad.net

Commit message

utils: added Tailer for tailing of logs in API

The Tailer is the initial component of the debug logging command
of the API. It allows the filtered tailing of any ReadSeeker.
If no filter is passed, all lines are written to the passed
Writer; otherwise only those for which the filter function returns
true. The initial number of lines can also be specified, and the
filter applies there as well: if a File (which is a ReadSeeker)
contains 100 lines, 10 lines are requested, and only 5 of those
match the filter, just those 5 lines are returned.

https://codereview.appspot.com/36540043/
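The "last n filtered lines" behaviour described above can be sketched as a small standalone function. This is a hypothetical helper for illustration only, not the Tailer's actual API; the name `lastMatchingLines` and its signature are assumptions:

```go
package main

import (
	"bytes"
	"fmt"
)

// lastMatchingLines is a hypothetical helper (not the Tailer API)
// returning up to n of the final lines in data for which filter
// returns true. A nil filter matches every line. An unterminated
// trailing line is ignored, mirroring the Tailer's handling of
// partially written files.
func lastMatchingLines(data []byte, n int, filter func([]byte) bool) [][]byte {
	// Drop anything after the last newline (an unterminated line).
	i := bytes.LastIndexByte(data, '\n')
	if i < 0 {
		return nil
	}
	data = data[:i+1]
	lines := bytes.SplitAfter(data, []byte("\n"))
	lines = lines[:len(lines)-1] // SplitAfter leaves a trailing empty slice
	var matched [][]byte
	// Walk backwards, collecting matches until n lines are found.
	for j := len(lines) - 1; j >= 0 && len(matched) < n; j-- {
		if filter == nil || filter(lines[j]) {
			matched = append([][]byte{lines[j]}, matched...)
		}
	}
	return matched
}

func main() {
	data := []byte("alpha one\nbeta\nalpha two\ngamma\nalpha three\n")
	isAlpha := func(line []byte) bool { return bytes.HasPrefix(line, []byte("alpha")) }
	for _, line := range lastMatchingLines(data, 2, isAlpha) {
		fmt.Printf("%s", line) // prints "alpha two" and "alpha three"
	}
}
```

So from five lines, two matching lines are requested and only the last two `alpha` lines come back, as in the 100/10/5 example in the commit message.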

Description of the change

utils: added Tailer for tailing of logs in API

The Tailer is the initial component of the debug logging command
of the API. It allows the filtered tailing of any ReadSeeker.
If no filter is passed, all lines are written to the passed
Writer; otherwise only those for which the filter function returns
true. The initial number of lines can also be specified, and the
filter applies there as well: if a File (which is a ReadSeeker)
contains 100 lines, 10 lines are requested, and only 5 of those
match the filter, just those 5 lines are returned.

https://codereview.appspot.com/36540043/

Revision history for this message
Frank Mueller (themue) wrote :

Reviewers: mp+197522_code.launchpad.net,

Message:
Please take a look.

Description:
utils: added Tailer for tailing of logs in API

The Tailer is the initial component of the debug logging command
of the API. It allows the filtered tailing of any ReadSeeker.
If no filter is passed, all lines are written to the passed
Writer; otherwise only those for which the filter function returns
true. The initial number of lines can also be specified, and the
filter applies there as well: if a File (which is a ReadSeeker)
contains 100 lines, 10 lines are requested, and only 5 of those
match the filter, just those 5 lines are returned.

https://code.launchpad.net/~themue/juju-core/057-tailer/+merge/197522

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/36540043/

Affected files (+412, -0 lines):
   A [revision details]
   A utils/tailer.go
   A utils/tailer_test.go

Revision history for this message
Roger Peppe (rogpeppe) wrote :

This is a good start, but I have a few comments and suggestions.

I wonder if it might sit well inside its own package rather than being
added to the grab-bag of stuff in utils.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go
File utils/tailer.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode30
utils/tailer.go:30: func StartFileTailer(filename string, lines int,
filter TailerFilterFunc,
I think this function is unnecessary. It's trivial for other code to
open a file.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode40
utils/tailer.go:40: func StartTailer(readSeeker io.ReadSeeker, lines
int, filter TailerFilterFunc,
Please document this function properly.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode87
utils/tailer.go:87: println("> error:", err.Error())
d

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode98
utils/tailer.go:98: if buffer[i] == '\n' {
This isn't strictly accurate (what happens when we have an unterminated
line at the end of the file?) but it's probably ok if we never filter
unterminated lines. A comment about why it's ok might prevent future
puzzling over the correctness of the code.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode100
utils/tailer.go:100: if foundNewlines-1 == t.lines {
s/==/>=/ defensively.

Also, if we move the if statement before the increment,
then you can lose the "-1".

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode127
utils/tailer.go:127: line, err := reader.ReadString('\n')
I think I'd be tempted to use ReadSlice here (changing the type of
filter to func([]byte) bool).
That way the tailer won't need to generate any garbage at all AFAICS.

Unfortunately that limits the line size to the size of the bufio buffer,
which may be acceptable. An alternative is to use ReadLine and allocate
only if the line is too long.

Here's some code I wrote a while ago to do that, in case it might be
useful:

// readLine reads a line from r.
// The returned byte slice is only valid until the
// next read call on r.
func readLine(r *bufio.Reader) ([]byte, error) {
 line, isPrefix, err := r.ReadLine()
 if !isPrefix {
  return line, err
 }
 buf := append([]byte(nil), line...)
 for isPrefix && err == nil {
  line, isPrefix, err = r.ReadLine()
  buf = append(buf, line...)
 }
 return buf, err
}

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode147
utils/tailer.go:147: for {
This code is exactly the same as the code above.
Can't we do with just this? (if we just delete the
loop above and use NewTimer(0), I think it might
just work).

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode149
utils/tailer.go:149: if len(line) > 0 {
I think we should ignore the line at this point if it is unterminated,
something that's quite possible if the file is being written in
arbitrary chunks. If we find an unterminated line, we should seek back
to the start of the line before we start reading again.

That means we could pass the line to filter without the trailing \n,
meaning we can be more resilient if something s...


Revision history for this message
William Reade (fwereade) wrote :

WIP in response to rog's review

Revision history for this message
Frank Mueller (themue) wrote :

Please take a look.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go
File utils/tailer.go (right):

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode30
utils/tailer.go:30: func StartFileTailer(filename string, lines int,
filter TailerFilterFunc,
On 2013/12/05 16:20:37, rog wrote:
> I think this function is unnecessary. It's trivial for other code to
open a
> file.

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode40
utils/tailer.go:40: func StartTailer(readSeeker io.ReadSeeker, lines
int, filter TailerFilterFunc,
On 2013/12/05 16:20:37, rog wrote:
> Please document this function properly.

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode87
utils/tailer.go:87: println("> error:", err.Error())
On 2013/12/05 16:20:37, rog wrote:
> d

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode98
utils/tailer.go:98: if buffer[i] == '\n' {
On 2013/12/05 16:20:37, rog wrote:
> This isn't strictly accurate (what happens when we have an
unterminated line at
> the end of the file?) but it's probably ok if we never filter
unterminated
> lines. A comment about why it's ok might prevent future puzzling over
the
> correctness of the code.

Added a comment at the type declaration.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode100
utils/tailer.go:100: if foundNewlines-1 == t.lines {
On 2013/12/05 16:20:37, rog wrote:
> s/==/>=/

> defensively.

> Also, if we move the if statement before the increment,
> then you can lose the "-1".

Done.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode127
utils/tailer.go:127: line, err := reader.ReadString('\n')
On 2013/12/05 16:20:37, rog wrote:
> I think I'd be tempted to use ReadSlice here (changing the type of
filter to
> func([]byte)bool.
> That way the tailer won't need to generate any garbage at all AFAICS.

> Unfortunately that limits the line size to the size of the bufio
buffer, which
> may be acceptable. An alternative is to use ReadLine and allocate only
if the
> line is too long.

> Here's some code I wrote a while ago to do that, in case it might be
useful:

> // readLine reads a line from r.
> // The returned byte slice is only valid until the
> // next read call on r.
> func readLine(r *bufio.Reader) ([]byte, error) {
> line, isPrefix, err := r.ReadLine()
> if !isPrefix {
> return line, err
> }
> buf := append([]byte(nil), line...)
> for isPrefix && err == nil {
> line, isPrefix, err = r.ReadLine()
> buf = append(buf, line...)
> }
> return buf, err
> }

Good hint, using it.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode147
utils/tailer.go:147: for {
On 2013/12/05 16:20:37, rog wrote:
> This code is exactly the same as the code above.
> Can't we do with just this? (if we just delete the
> loop above and use NewTimer(0), I think it might
> just work).

Yep, the split was due to the initial approach of scanning everything up
to the end first. Removed. Thanks.

https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode149
utils/tailer.go:149: if len(line) > 0 {
On 2013/12/0...


Revision history for this message
Roger Peppe (rogpeppe) wrote :

Getting there! A few more thoughts and suggestions below.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode41
utils/tailer/tailer.go:41: // Writer. The reading beginns the specified
number of matching lines
s/beginns/begins/

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode43
utils/tailer/tailer.go:43: func NewStandardTailer(readSeeker
io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc)
*Tailer {
Given that this function is going to be the one that everyone calls, I
think I'd name it NewTailer, but do we actually need to expose buffer
size and poll interval publicly except to the tailer tests?

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode126
utils/tailer/tailer.go:126: readBuffer := make([]byte, t.bufsize)
This seems rather complex to me - I don't fully trust myself to vet the
logic. Also, I don't think we need to have a separate line buffer and
read buffer - we can use a single buffer and read directly into it. That
way we can avoid allocating a new buffer every time round the loop too,
although it is necessary to copy something from the start to the end.

I started trying to explain a better possibility, but got carried away...
Here's the kind of thing I mean. Still somewhat complex, but I've tried
to keep the invariants simple. It needs testing (I've left in
one deliberate mistake that I hope to see a test for). I suggest
some internal tests that test this function in isolation with
quite a few different inputs.

// seekLastLines sets the read position of the ReadSeeker to the
// wanted number of filtered lines before the end.
func (t *Tailer) seekLastLines() error {
 offset, err := t.readSeeker.Seek(0, os.SEEK_END)
 if err != nil {
  return err
 }
 seekPos := int64(0)
 found := 0
 buf := make([]byte, minRead)
SeekLoop:
 for offset > 0 {
  // buf contains the data left over from the
  // previous iteration.
  space := cap(buf) - len(buf)
  if space < minRead {
   // grow buffer
   newBuf := make([]byte, len(buf), cap(buf)*2)
   copy(newBuf, buf)
   buf = newBuf
   space = cap(buf) - len(buf)

  }
  if int64(space) > offset {
   // Use exactly the right amount of space if there's
   // only a small amount remaining.
   space = int(offset)
  }
  // copy data remaining from last time to the end of the buffer,
  // so we can read into the right place.
  copy(buf[space:cap(buf)], buf)
  buf = buf[0 : len(buf)+space]

  offset -= int64(space)
  _, err := t.readSeeker.Seek(offset, os.SEEK_SET)
  if err != nil {
   return err
  }
  _, err = io.ReadFull(t.readSeeker, buf[0:space])
  if err != nil {
   return err
  }
  // Find the end of the last line in the buffer.
  // This will discard any unterminated line at the end
  // of the file.
  end := bytes.LastIndex(buf, delim)
  if end == -1 {
   // No end of line found - discard incomplete
   // line and continue looking. If this happens
   // at the beginning of the file, we don't care
   // because we're going to stop anyway.
   buf = buf[:0]
   continue
  }
  end++
  fo...


Revision history for this message
William Reade (fwereade) wrote :

WIPping to address rog's review

Revision history for this message
Frank Mueller (themue) wrote :

Please take a look.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode41
utils/tailer/tailer.go:41: // Writer. The reading beginns the specified
number of matching lines
On 2013/12/10 15:14:25, rog wrote:
> s/beginns/begins/

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode43
utils/tailer/tailer.go:43: func NewStandardTailer(readSeeker
io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc)
*Tailer {
On 2013/12/10 15:14:25, rog wrote:
> Given that this function is going to be the one that everyone calls, I
think I'd
> name it NewTailer, but do we actually need to expose buffer size and
poll
> interval publicly except to the tailer tests?

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode126
utils/tailer/tailer.go:126: readBuffer := make([]byte, t.bufsize)
On 2013/12/10 15:14:25, rog wrote:
> This seems rather complex to me - I don't fully trust myself to vet
the logic.

Used it, even if I had the same trouble following your array logic and
the needed debugging. Thankfully the comments helped to follow the idea.
And at least it has fewer lines. ;) Thx.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode198
utils/tailer/tailer.go:198: // Reached beginnig of data.
On 2013/12/10 15:14:25, rog wrote:
> s/beginnig/beginning/

Not needed anymore due to the new function.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode212
utils/tailer/tailer.go:212: buffer, err := t.reader.ReadBytes(delimiter)
On 2013/12/10 15:14:25, rog wrote:
> ReadBytes can't return ErrBufferFull.

> I think you want to use ReadSlice.
> Also, if the first call succeeds, we should just return the slice that
we got
> from ReadSlice - that way in the usual case that the line is shorter
than the
> bufio buffer we can avoid any allocation.

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode226
utils/tailer/tailer.go:226: if len(line) == 0 || line[len(line)-1] ==
delimiter {
On 2013/12/10 15:14:25, rog wrote:
> This can't happen. (from the docs: "returns err != nil if and only if
the
> returned data does not end in delim")

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode234
utils/tailer/tailer.go:234: t.readSeeker.Seek(offset, os.SEEK_END)
On 2013/12/10 15:14:25, rog wrote:
> This is racy. We can't seek from the end because the end might be
constantly
> changing. I think we need to keep track of where we're reading in the
file and
> seek back to the absolute offset of the start of the partial line.

Done.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode247
utils/tailer/tailer.go:247: return false
On 2013/12/10 15:14:25, rog wrote:
> Why?

Yeah, too much automatic filtering here. The receiver of the data should
decide. Removed it.

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode252
utils/tailer/tailer.go:252: return t.filter(line)
On 2013/12/10 15:14...


Revision history for this message
Roger Peppe (rogpeppe) wrote :

Another round of suggestions, with one or two things still to fix.
Thanks for bearing with me!

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/40001/utils/tailer/tailer.go#newcode252
utils/tailer/tailer.go:252: return t.filter(line)
On 2013/12/11 17:13:05, mue wrote:
> On 2013/12/10 15:14:25, rog wrote:
> > Consider trimming \r?\n from the end of the line before calling
filter?

> Why?

Then the filter functions don't need to worry about line termination
characters at all and we know that we're safe even when we have windows
stuff generating \r\n lines.
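The trimming rog suggests can be sketched in a few lines. This is an illustrative helper, not the merged Tailer code; the name `trimLine` is an assumption:

```go
package main

import (
	"bytes"
	"fmt"
)

// trimLine strips a trailing "\n" or "\r\n" before a line reaches
// the filter, so filter functions never see line termination
// characters, even for Windows-generated CRLF lines.
func trimLine(line []byte) []byte {
	line = bytes.TrimSuffix(line, []byte("\n"))
	return bytes.TrimSuffix(line, []byte("\r"))
}

func main() {
	fmt.Printf("%q\n", trimLine([]byte("hello\r\n"))) // "hello"
	fmt.Printf("%q\n", trimLine([]byte("world\n")))   // "world"
}
```

Stripping "\r" only after "\n" has been removed means a lone carriage return inside a line is left untouched.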

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#newcode131
utils/tailer/tailer.go:131: buffer := make([]byte, t.bufsize)
If we've got t.bufsize, "buf" would seem logical as a name.
Not that it matters much though.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#newcode211
utils/tailer/tailer.go:211: line = append(line, slice...)
Unfortunately I think this is wrong, as we may be appending to the
internal bufio.Reader buffer, which may be overwritten by the ReadSlice
call.

How about separating the first-slice case from the appending logic?

Something like this, perhaps?

// readLine reads the next valid line from the reader, even if it is
// larger than the reader buffer.
func (t *Tailer) readLine() ([]byte, error) {
 for {
  slice, err := t.reader.ReadSlice(delimiter)
  if err == nil {
   if t.isValid(slice) {
    return slice, nil
   }
   continue
  }
  line := append([]byte(nil), slice...)
  for err == bufio.ErrBufferFull {
   slice, err = t.reader.ReadSlice(delimiter)
   line = append(line, slice...)
  }
  switch err {
  case nil:
   if t.isValid(line) {
    return line, nil
   }
  case io.EOF:
   // EOF without delimiter, step back.
   t.readSeeker.Seek(-int64(len(line)), os.SEEK_CUR)
   return nil, err
  default:
   return nil, err
  }
 }
}

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#newcode224
utils/tailer/tailer.go:224: t.readSeeker.Seek(-offset, os.SEEK_CUR)
Ah, that's nicer than my suggestion, thanks.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go#newcode30
utils/tailer/tailer_test.go:30: buffer := bytes.NewBuffer(nil)
I'm afraid the way you're using this is racy (and in all the other tests
too) because you're reading from the buffer (with assertCollected)
concurrently with the tailer writing to it.

You could use io.Pipe instead of bytes.Buffer to avoid the problem.

BTW whenever you've got goroutine-based code, it's worth running go test
-race to check this kind of stuff.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go#newcode47
utils/tailer/tailer_test.go:47: func (tailerSuite)
TestLaggedTermination(c *gc.C) {
I see 10 of these tests are almost identical.
This suggests to me that a table-based te...


Revision history for this message
Frank Mueller (themue) wrote :

Please take a look.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#newcode131
utils/tailer/tailer.go:131: buffer := make([]byte, t.bufsize)
On 2013/12/11 18:27:41, rog wrote:
> If we've got t.bufsize, "buf" would seem logical as a name.
> Not that it matters much though.

Ack, but in the other direction. ;)

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer.go#newcode211
utils/tailer/tailer.go:211: line = append(line, slice...)
On 2013/12/11 18:27:41, rog wrote:
> Unfortunately I think this is wrong, as we may be appending to the
internal
> bufio.Reader buffer, which may be overwritten by the ReadSlice call.

> How about separating the first-slice case from the appending logic?

> Something like this, perhaps?

> // readLine reads the next valid line from the reader, even if it is
> // larger than the reader buffer.
> func (t *Tailer) readLine() ([]byte, error) {
> for {
> slice, err := t.reader.ReadSlice(delimiter)
> if err == nil {
> if t.isValid(slice) {
> return slice, nil
> }
> continue
> }
> line := append([]byte(nil), slice...)
> for err == bufio.ErrBufferFull {
> slice, err = t.reader.ReadSlice(delimiter)
> line = append(line, slice...)
> }
> switch err {
> case nil:
> if t.isValid(line) {
> return line, nil
> }
> case io.EOF:
> // EOF without delimiter, step back.
> t.readSeeker.Seek(-int64(len(line)), os.SEEK_CUR)
> return nil, err
> default:
> return nil, err
> }
> }
> }

Yeah, that's better, thx.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go#newcode30
utils/tailer/tailer_test.go:30: buffer := bytes.NewBuffer(nil)
On 2013/12/11 18:27:41, rog wrote:
> I'm afraid the way you're using this is racy (and in all the other
tests too)
> because you're reading from the buffer (with assertCollected)
concurrently with
> the tailer writing to it.

> You could use io.Pipe instead of bytes.Buffer to avoid the problem.

> BTW whenever you've got goroutine-based code, it's worth running go
test -race
> to check this kind of stuff.

Done.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go#newcode47
utils/tailer/tailer_test.go:47: func (tailerSuite)
TestLaggedTermination(c *gc.C) {
On 2013/12/11 18:27:41, rog wrote:
> I see 10 of these tests are almost identical.
> This suggests to me that a table-based test might work well here, and
make it
> easier to see what's actually being tested.

> Then we could easily add quite a few more tests (for example, I'd like
to see
> tests with blank lines in various places, and probably some more too).

Done.

https://codereview.appspot.com/36540043/diff/60001/utils/tailer/tailer_test.go#newcode264
utils/tailer/tailer_test.go:264: disturber := func(lines []string) {
On 2013/12/11 18:27:41, rog wrote:
> This should probably be defined just before it's used.

Now using a different approach with table-driven tests.

http...


Revision history for this message
Roger Peppe (rogpeppe) wrote :

Another round. Hopefully done after this one!

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go#newcode49
utils/tailer/tailer.go:49: func NewTailer(readSeeker io.ReadSeeker,
writeCloser io.WriteCloser, lines int, filter TailerFilterFunc) *Tailer
{
I don't think there's any particular reason we want to give this code
the responsibility for closing the writer. It's quite possible we might
have several tailers writing to the same writer, for example.

Better would be a Wait method (and perhaps a Dead method) to wait until
it's finished. Then the caller has the option of closing the writer.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode40
utils/tailer/tailer_test.go:40: {
If we move this brace onto the previous line, we can save a level of
indentation in all these tests.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode42
utils/tailer/tailer_test.go:42: data: data[26:29],
I like the new tests much better now. One thought though:

I'm not convinced that referring to "data" in all these tests makes them
more readable - I have no idea what's in data[26:29] without counting or
using grep -n.

Something like the below shows me exactly what's going
on without me needing to refer to multiple places,
and is independent of any changes else in the code.

It would make it considerably easier for me to scan down
the tests to ensure that each one represents reasonable
behaviour.

{
     description: "lines are longer than buffer size",
     data: []string{
         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
         "0123456789012345678901234567890123456789012345678901\n",
         "the quick brown fox ",
     },
     initialLinesWritten: 1,
     initialLinesRequested: 1,
     bufferSize: 5,
     initialCollectedData: []string{
         "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
     },
     appendedCollectedData: []string{
         "0123456789012345678901234567890123456789012345678901\n",
     },
},

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode215
utils/tailer/tailer_test.go:215: line, err := buffer.ReadString('\n')
If the tailer doesn't produce enough data, we'll block here forever I
think, regardless of the timeout.

I think you probably want something more like the below:

func assertCollected(c *gc.C, reader io.Reader, compare []string, injection func([]string)) {
    lineChan := make(chan string)
    go func() {
        defer close(lineChan)
        reader := bufio.NewReader(reader)
        for {
            line, err := reader.ReadString('\n')
            if !c.Check(err, gc.IsNil) {
                return
            }
            lineChan <- line
        }
    }()
    timeout := time.After(testing.LongWait)
    for {
         select {
         case line, ok := <-lineChan:
              c.Assert(ok, jc.IsTrue)
...


Revision history for this message
Frank Mueller (themue) wrote :

Please take a look.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go
File utils/tailer/tailer.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer.go#newcode49
utils/tailer/tailer.go:49: func NewTailer(readSeeker io.ReadSeeker,
writeCloser io.WriteCloser, lines int, filter TailerFilterFunc) *Tailer
{
On 2013/12/12 14:46:39, rog wrote:
> I don't think there's any particular reason we want to give this code
the
> responsibility for closing the writer. It's quite possible we might
have several
> tailers writing to the same writer, for example.

> Better would be a Wait method (and perhaps a Dead method) to wait
until
> it's finished. Then the caller has the option of closing the writer.

Done as discussed.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode40
utils/tailer/tailer_test.go:40: {
On 2013/12/12 14:46:39, rog wrote:
> If we move this brace onto the previous line, we can save a level of
indentation
> in all these tests.

Done.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode42
utils/tailer/tailer_test.go:42: data: data[26:29],
On 2013/12/12 14:46:39, rog wrote:
> I like the new tests much better now. One thought though:

> I'm not convinced that referring to "data" in all these tests makes
them more
> readable - I have no idea what's in data[26:29] without counting or
using grep
> -n.

> Something like the below shows me exactly what's going
> on without me needing to refer to multiple places,
> and is independent of any changes else in the code.

> It would make it considerably easier for me to scan down
> the tests to ensure that each one represents reasonable
> behaviour.

> {
> description: "lines are longer than buffer size",
> data: []string{
> "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> "0123456789012345678901234567890123456789012345678901\n",
> "the quick brown fox ",
> },
> initialLinesWritten: 1,
> initialLinesRequested: 1,
> bufferSize: 5,
> initialCollectedData: []string{
> "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> },
> appendedCollectedData: []string{
> "0123456789012345678901234567890123456789012345678901\n",
> },
> },

In this case the test table would blow up again and would also be
inconsistent (e.g. when using the standard, larger group of lines). So
I'm now using three different data variables with more descriptive names
as a compromise, and located them near the table.

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode215
utils/tailer/tailer_test.go:215: line, err := buffer.ReadString('\n')
On 2013/12/12 14:46:39, rog wrote:
> If the tailer doesn't produce enough data, we'll block here forever I
think,
> regardless of the timeout.

> I think you probably want something more like the below:

> func assertCollected(c *gc.C, reader io.Reader, compare []string,
injection
> func([]st...


Revision history for this message
Roger Peppe (rogpeppe) wrote :

LGTM with the tests fixed, thanks!

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/80001/utils/tailer/tailer_test.go#newcode42
utils/tailer/tailer_test.go:42: data: data[26:29],
On 2013/12/13 11:23:12, mue wrote:
> On 2013/12/12 14:46:39, rog wrote:
> > I like the new tests much better now. One thought though:
> >
> > I'm not convinced that referring to "data" in all these tests makes
them more
> > readable - I have no idea what's in data[26:29] without counting or
using grep
> > -n.
> >
> > Something like the below shows me exactly what's going
> > on without me needing to refer to multiple places,
> > and is independent of any changes else in the code.
> >
> > It would make it considerably easier for me to scan down
> > the tests to ensure that each one represents reasonable
> > behaviour.
> >
> > {
> > description: "lines are longer than buffer size",
> > data: []string{
> > "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> > "0123456789012345678901234567890123456789012345678901\n",
> > "the quick brown fox ",
> > },
> > initialLinesWritten: 1,
> > initialLinesRequested: 1,
> > bufferSize: 5,
> > initialCollectedData: []string{
> > "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
> > },
> > appendedCollectedData: []string{
> > "0123456789012345678901234567890123456789012345678901\n",
> > },
> > },
> >

> This case the test table would blow up again and also would be
inconsistent
> (e.g. when using the standard larger group of lines). So now using
three
> different data variables with more speaking names as a compromise.
Also located
> them near to the table.

I'm with you about alphabetData (which is also nicely memorable, which
helps), but most of the tests really do benefit from having the data
visible in the test.

I processed my local copy to move much of the test data into the tests,
and two issues became immediately obvious, though I'd missed them when
cross-referencing, which I think is a reasonable indication that the
tests are not currently that clear.

Here's my modified version of the test. Only alphabetData remains. I'd
prefer it if we could use this form.

http://paste.ubuntu.com/6566524/

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go#newcode100
utils/tailer/tailer_test.go:100: description: "lines are
longer than buffer size, missing termination of last line",
How is the last line missing termination here?

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go#newcode117
utils/tailer/tailer_test.go:117: data:
unterminatedData[0:2],
The last line doesn't seem to be missing termination here.

https://codereview.appspot.com/36540043/

Revision history for this message
Frank Mueller (themue) wrote :

Please take a look.

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go
File utils/tailer/tailer_test.go (right):

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go#newcode100
utils/tailer/tailer_test.go:100: description: "lines are
longer than buffer size, missing termination of last line",
On 2013/12/13 12:52:14, rog wrote:
> How is the last line missing termination here?

Done.

https://codereview.appspot.com/36540043/diff/100001/utils/tailer/tailer_test.go#newcode117
utils/tailer/tailer_test.go:117: data:
unterminatedData[0:2],
On 2013/12/13 12:52:14, rog wrote:
> The last line doesn't seem to be missing termination here.

Done.

https://codereview.appspot.com/36540043/

Preview Diff

1=== added directory 'utils/tailer'
2=== added file 'utils/tailer/export_test.go'
3--- utils/tailer/export_test.go 1970-01-01 00:00:00 +0000
4+++ utils/tailer/export_test.go 2013-12-13 13:24:18 +0000
5@@ -0,0 +1,6 @@
6+// Copyright 2013 Canonical Ltd.
7+// Licensed under the AGPLv3, see LICENCE file for details.
8+
9+package tailer
10+
11+var NewTestTailer = newTailer
12
13=== added file 'utils/tailer/tailer.go'
14--- utils/tailer/tailer.go 1970-01-01 00:00:00 +0000
15+++ utils/tailer/tailer.go 2013-12-13 13:24:18 +0000
16@@ -0,0 +1,250 @@
17+// Copyright 2013 Canonical Ltd.
18+// Licensed under the AGPLv3, see LICENCE file for details.
19+
20+package tailer
21+
22+import (
23+ "bufio"
24+ "bytes"
25+ "io"
26+ "os"
27+ "time"
28+
29+ "launchpad.net/tomb"
30+)
31+
32+const (
33+ bufferSize = 4096
34+ polltime = time.Second
35+ delimiter = '\n'
36+)
37+
38+var (
39+ delimiters = []byte{delimiter}
40+)
41+
42+// TailerFilterFunc decides if a line shall be tailed (func is nil or
43+// returns true) or shall be omitted (func returns false).
44+type TailerFilterFunc func(line []byte) bool
45+
46+// Tailer reads an input line by line and tails them into the passed Writer.
47+// The lines have to be terminated with a newline.
48+type Tailer struct {
49+ tomb tomb.Tomb
50+ readSeeker io.ReadSeeker
51+ reader *bufio.Reader
52+ writeCloser io.WriteCloser
53+ writer *bufio.Writer
54+ lines int
55+ filter TailerFilterFunc
56+ bufferSize int
57+ polltime time.Duration
58+}
59+
60+// NewTailer starts a Tailer which reads strings from the passed
61+// ReadSeeker line by line. If a filter function is specified the read
62+// lines are filtered. The matching lines are written to the passed
63+// Writer. The reading begins the specified number of matching lines
64+// from the end.
65+func NewTailer(readSeeker io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc) *Tailer {
66+ return newTailer(readSeeker, writer, lines, filter, bufferSize, polltime)
67+}
68+
69+// newTailer starts a Tailer like NewTailer but allows the setting of
70+// the read buffer size and the time between pollings for testing.
71+func newTailer(readSeeker io.ReadSeeker, writer io.Writer, lines int, filter TailerFilterFunc,
72+ bufferSize int, polltime time.Duration) *Tailer {
73+ t := &Tailer{
74+ readSeeker: readSeeker,
75+ reader: bufio.NewReaderSize(readSeeker, bufferSize),
76+ writer: bufio.NewWriter(writer),
77+ lines: lines,
78+ filter: filter,
79+ bufferSize: bufferSize,
80+ polltime: polltime,
81+ }
82+ go func() {
83+ defer t.tomb.Done()
84+ t.tomb.Kill(t.loop())
85+ }()
86+ return t
87+}
88+
89+// Stop tells the tailer to stop working.
90+func (t *Tailer) Stop() error {
91+ t.tomb.Kill(nil)
92+ return t.tomb.Wait()
93+}
94+
95+// Wait waits until the tailer is stopped due to command
96+// or an error. In case of an error it returns the reason.
97+func (t *Tailer) Wait() error {
98+ return t.tomb.Wait()
99+}
100+
101+// Dead returns the channel that can be used to wait until
102+// the tailer is stopped.
103+func (t *Tailer) Dead() <-chan struct{} {
104+ return t.tomb.Dead()
105+}
106+
107+// Err returns the error that caused the tailer to stop, if any.
108+func (t *Tailer) Err() error {
109+ return t.tomb.Err()
110+}
111+
112+// loop writes the requested number of last lines to the
113+// writer and then polls for more data, writing it to the
114+// writer too.
115+func (t *Tailer) loop() error {
116+ // Position the readSeeker.
117+ if err := t.seekLastLines(); err != nil {
118+ return err
119+ }
120+ // Start polling.
121+ // TODO(mue) 2013-12-06
122+ // Handling of read-seeker/files being truncated during
123+ // tailing is currently missing!
124+ timer := time.NewTimer(0)
125+ for {
126+ select {
127+ case <-t.tomb.Dying():
128+ return nil
129+ case <-timer.C:
130+ for {
131+ line, readErr := t.readLine()
132+ _, writeErr := t.writer.Write(line)
133+ if writeErr != nil {
134+ return writeErr
135+ }
136+ if readErr != nil {
137+ if readErr != io.EOF {
138+ return readErr
139+ }
140+ break
141+ }
142+ }
143+ if writeErr := t.writer.Flush(); writeErr != nil {
144+ return writeErr
145+ }
146+ timer.Reset(t.polltime)
147+ }
148+ }
149+}
150+
151+// seekLastLines sets the read position of the ReadSeeker to the
152+// wanted number of filtered lines before the end.
153+func (t *Tailer) seekLastLines() error {
154+ offset, err := t.readSeeker.Seek(0, os.SEEK_END)
155+ if err != nil {
156+ return err
157+ }
158+ seekPos := int64(0)
159+ found := 0
160+ buffer := make([]byte, t.bufferSize)
161+SeekLoop:
162+ for offset > 0 {
163+ // buffer contains the data left over from the
164+ // previous iteration.
165+ space := cap(buffer) - len(buffer)
166+ if space < t.bufferSize {
167+ // Grow buffer.
168+ newBuffer := make([]byte, len(buffer), cap(buffer)*2)
169+ copy(newBuffer, buffer)
170+ buffer = newBuffer
171+ space = cap(buffer) - len(buffer)
172+ }
173+ if int64(space) > offset {
174+ // Use exactly the right amount of space if there's
175+ // only a small amount remaining.
176+ space = int(offset)
177+ }
178+ // Copy data remaining from last time to the end of the buffer,
179+ // so we can read into the right place.
180+ copy(buffer[space:cap(buffer)], buffer)
181+ buffer = buffer[0 : len(buffer)+space]
182+ offset -= int64(space)
183+ _, err := t.readSeeker.Seek(offset, os.SEEK_SET)
184+ if err != nil {
185+ return err
186+ }
187+ _, err = io.ReadFull(t.readSeeker, buffer[0:space])
188+ if err != nil {
189+ return err
190+ }
191+ // Find the end of the last line in the buffer.
192+ // This will discard any unterminated line at the end
193+ // of the file.
194+ end := bytes.LastIndex(buffer, delimiters)
195+ if end == -1 {
196+ // No end of line found - discard incomplete
197+ // line and continue looking. If this happens
198+ // at the beginning of the file, we don't care
199+ // because we're going to stop anyway.
200+ buffer = buffer[:0]
201+ continue
202+ }
203+ end++
204+ for {
205+ start := bytes.LastIndex(buffer[0:end-1], delimiters)
206+ if start == -1 && offset >= 0 {
207+ break
208+ }
209+ start++
210+ if t.isValid(buffer[start:end]) {
211+ found++
212+ if found >= t.lines {
213+ seekPos = offset + int64(start)
214+ break SeekLoop
215+ }
216+ }
217+ end = start
218+ }
219+ // Leave the last line in buffer, as we don't know whether
220+ // it's complete or not.
221+ buffer = buffer[0:end]
222+ }
223+ // Final positioning.
224+ t.readSeeker.Seek(seekPos, os.SEEK_SET)
225+ return nil
226+}
227+
228+// readLine reads the next valid line from the reader, even if it is
229+// larger than the reader buffer.
230+func (t *Tailer) readLine() ([]byte, error) {
231+ for {
232+ slice, err := t.reader.ReadSlice(delimiter)
233+ if err == nil {
234+ if t.isValid(slice) {
235+ return slice, nil
236+ }
237+ continue
238+ }
239+ line := append([]byte(nil), slice...)
240+ for err == bufio.ErrBufferFull {
241+ slice, err = t.reader.ReadSlice(delimiter)
242+ line = append(line, slice...)
243+ }
244+ switch err {
245+ case nil:
246+ if t.isValid(line) {
247+ return line, nil
248+ }
249+ case io.EOF:
250+ // EOF without delimiter, step back.
251+ t.readSeeker.Seek(-int64(len(line)), os.SEEK_CUR)
252+ return nil, err
253+ default:
254+ return nil, err
255+ }
256+ }
257+}
258+
259+// isValid reports whether the passed line is valid, i.e. the
260+// filter function is nil or returns true for the line.
261+func (t *Tailer) isValid(line []byte) bool {
262+ if t.filter == nil {
263+ return true
264+ }
265+ return t.filter(line)
266+}
267
268=== added file 'utils/tailer/tailer_test.go'
269--- utils/tailer/tailer_test.go 1970-01-01 00:00:00 +0000
270+++ utils/tailer/tailer_test.go 2013-12-13 13:24:18 +0000
271@@ -0,0 +1,496 @@
272+// Copyright 2013 Canonical Ltd.
273+// Licensed under the AGPLv3, see LICENCE file for details.
274+
275+package tailer_test
276+
277+import (
278+ "bufio"
279+ "bytes"
280+ "fmt"
281+ "io"
282+ "sync"
283+ stdtesting "testing"
284+ "time"
285+
286+ gc "launchpad.net/gocheck"
287+
288+ "launchpad.net/juju-core/testing"
289+ "launchpad.net/juju-core/utils/tailer"
290+)
291+
292+func Test(t *stdtesting.T) {
293+ gc.TestingT(t)
294+}
295+
296+type tailerSuite struct{}
297+
298+var _ = gc.Suite(tailerSuite{})
299+
300+var alphabetData = []string{
301+ "alpha alpha\n",
302+ "bravo bravo\n",
303+ "charlie charlie\n",
304+ "delta delta\n",
305+ "echo echo\n",
306+ "foxtrott foxtrott\n",
307+ "golf golf\n",
308+ "hotel hotel\n",
309+ "india india\n",
310+ "juliet juliet\n",
311+ "kilo kilo\n",
312+ "lima lima\n",
313+ "mike mike\n",
314+ "november november\n",
315+ "oscar oscar\n",
316+ "papa papa\n",
317+ "quebec quebec\n",
318+ "romeo romeo\n",
319+ "sierra sierra\n",
320+ "tango tango\n",
321+ "uniform uniform\n",
322+ "victor victor\n",
323+ "whiskey whiskey\n",
324+ "x-ray x-ray\n",
325+ "yankee yankee\n",
326+ "zulu zulu\n",
327+}
328+
329+var tests = []struct {
330+ description string
331+ data []string
332+ initialLinesWritten int
333+ initialLinesRequested int
334+ bufferSize int
335+ filter tailer.TailerFilterFunc
336+ injector func(*tailer.Tailer, *readSeeker) func([]string)
337+ initialCollectedData []string
338+ appendedCollectedData []string
339+ err string
340+}{{
341+ description: "lines are longer than buffer size",
342+ data: []string{
343+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
344+ "0123456789012345678901234567890123456789012345678901\n",
345+ },
346+ initialLinesWritten: 1,
347+ initialLinesRequested: 1,
348+ bufferSize: 5,
349+ initialCollectedData: []string{
350+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
351+ },
352+ appendedCollectedData: []string{
353+ "0123456789012345678901234567890123456789012345678901\n",
354+ },
355+}, {
356+ description: "lines are longer than buffer size, missing termination of last line",
357+ data: []string{
358+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
359+ "0123456789012345678901234567890123456789012345678901\n",
360+ "the quick brown fox ",
361+ },
362+ initialLinesWritten: 1,
363+ initialLinesRequested: 1,
364+ bufferSize: 5,
365+ initialCollectedData: []string{
366+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
367+ },
368+ appendedCollectedData: []string{
369+ "0123456789012345678901234567890123456789012345678901\n",
370+ },
371+}, {
372+ description: "lines are longer than buffer size, last line is terminated later",
373+ data: []string{
374+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
375+ "0123456789012345678901234567890123456789012345678901\n",
376+ "the quick brown fox ",
377+ "jumps over the lazy dog\n",
378+ },
379+ initialLinesWritten: 1,
380+ initialLinesRequested: 1,
381+ bufferSize: 5,
382+ initialCollectedData: []string{
383+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
384+ },
385+ appendedCollectedData: []string{
386+ "0123456789012345678901234567890123456789012345678901\n",
387+ "the quick brown fox jumps over the lazy dog\n",
388+ },
389+}, {
390+ description: "missing termination of last line",
391+ data: []string{
392+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
393+ "0123456789012345678901234567890123456789012345678901\n",
394+ "the quick brown fox ",
395+ },
396+ initialLinesWritten: 1,
397+ initialLinesRequested: 1,
398+ initialCollectedData: []string{
399+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
400+ },
401+ appendedCollectedData: []string{
402+ "0123456789012345678901234567890123456789012345678901\n",
403+ },
404+}, {
405+ description: "last line is terminated later",
406+ data: []string{
407+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
408+ "0123456789012345678901234567890123456789012345678901\n",
409+ "the quick brown fox ",
410+ "jumps over the lazy dog\n",
411+ },
412+ initialLinesWritten: 1,
413+ initialLinesRequested: 1,
414+ initialCollectedData: []string{
415+ "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz\n",
416+ },
417+ appendedCollectedData: []string{
418+ "0123456789012345678901234567890123456789012345678901\n",
419+ "the quick brown fox jumps over the lazy dog\n",
420+ },
421+}, {
422+ description: "more lines already written than initially requested",
423+ data: alphabetData,
424+ initialLinesWritten: 5,
425+ initialLinesRequested: 3,
426+ initialCollectedData: []string{
427+ "charlie charlie\n",
428+ "delta delta\n",
429+ "echo echo\n",
430+ },
431+ appendedCollectedData: alphabetData[5:],
432+}, {
433+ description: "less lines already written than initially requested",
434+ data: alphabetData,
435+ initialLinesWritten: 3,
436+ initialLinesRequested: 5,
437+ initialCollectedData: []string{
438+ "alpha alpha\n",
439+ "bravo bravo\n",
440+ "charlie charlie\n",
441+ },
442+ appendedCollectedData: alphabetData[3:],
443+}, {
444+ description: "lines are longer than buffer size, more lines already written than initially requested",
445+ data: alphabetData,
446+ initialLinesWritten: 5,
447+ initialLinesRequested: 3,
448+ bufferSize: 5,
449+ initialCollectedData: []string{
450+ "charlie charlie\n",
451+ "delta delta\n",
452+ "echo echo\n",
453+ },
454+ appendedCollectedData: alphabetData[5:],
455+}, {
456+ description: "lines are longer than buffer size, less lines already written than initially requested",
457+ data: alphabetData,
458+ initialLinesWritten: 3,
459+ initialLinesRequested: 5,
460+ bufferSize: 5,
461+ initialCollectedData: []string{
462+ "alpha alpha\n",
463+ "bravo bravo\n",
464+ "charlie charlie\n",
465+ },
466+ appendedCollectedData: alphabetData[3:],
467+}, {
468+ description: "filter lines which contain the char 'e'",
469+ data: alphabetData,
470+ initialLinesWritten: 10,
471+ initialLinesRequested: 3,
472+ filter: func(line []byte) bool {
473+ return bytes.Contains(line, []byte{'e'})
474+ },
475+ initialCollectedData: []string{
476+ "echo echo\n",
477+ "hotel hotel\n",
478+ "juliet juliet\n",
479+ },
480+ appendedCollectedData: []string{
481+ "mike mike\n",
482+ "november november\n",
483+ "quebec quebec\n",
484+ "romeo romeo\n",
485+ "sierra sierra\n",
486+ "whiskey whiskey\n",
487+ "yankee yankee\n",
488+ },
489+}, {
490+ description: "stop tailing after 10 collected lines",
491+ data: alphabetData,
492+ initialLinesWritten: 5,
493+ initialLinesRequested: 3,
494+ injector: func(t *tailer.Tailer, rs *readSeeker) func([]string) {
495+ return func(lines []string) {
496+ if len(lines) == 10 {
497+ t.Stop()
498+ }
499+ }
500+ },
501+ initialCollectedData: []string{
502+ "charlie charlie\n",
503+ "delta delta\n",
504+ "echo echo\n",
505+ },
506+ appendedCollectedData: alphabetData[5:],
507+}, {
508+ description: "generate an error after 10 collected lines",
509+ data: alphabetData,
510+ initialLinesWritten: 5,
511+ initialLinesRequested: 3,
512+ injector: func(t *tailer.Tailer, rs *readSeeker) func([]string) {
513+ return func(lines []string) {
514+ if len(lines) == 10 {
515+ rs.setError(fmt.Errorf("ouch after 10 lines"))
516+ }
517+ }
518+ },
519+ initialCollectedData: []string{
520+ "charlie charlie\n",
521+ "delta delta\n",
522+ "echo echo\n",
523+ },
524+ appendedCollectedData: alphabetData[5:],
525+ err: "ouch after 10 lines",
526+}, {
527+ description: "more lines already written than initially requested, some empty, unfiltered",
528+ data: []string{
529+ "one one\n",
530+ "two two\n",
531+ "\n",
532+ "\n",
533+ "three three\n",
534+ "four four\n",
535+ "\n",
536+ "\n",
537+ "five five\n",
538+ "six six\n",
539+ },
540+ initialLinesWritten: 3,
541+ initialLinesRequested: 2,
542+ initialCollectedData: []string{
543+ "two two\n",
544+ "\n",
545+ },
546+ appendedCollectedData: []string{
547+ "\n",
548+ "three three\n",
549+ "four four\n",
550+ "\n",
551+ "\n",
552+ "five five\n",
553+ "six six\n",
554+ },
555+}, {
556+ description: "more lines already written than initially requested, some empty, those filtered",
557+ data: []string{
558+ "one one\n",
559+ "two two\n",
560+ "\n",
561+ "\n",
562+ "three three\n",
563+ "four four\n",
564+ "\n",
565+ "\n",
566+ "five five\n",
567+ "six six\n",
568+ },
569+ initialLinesWritten: 3,
570+ initialLinesRequested: 2,
571+ filter: func(line []byte) bool {
572+ return len(bytes.TrimSpace(line)) > 0
573+ },
574+ initialCollectedData: []string{
575+ "one one\n",
576+ "two two\n",
577+ },
578+ appendedCollectedData: []string{
579+ "three three\n",
580+ "four four\n",
581+ "five five\n",
582+ "six six\n",
583+ },
584+}}
585+
586+func (tailerSuite) TestTailer(c *gc.C) {
587+ for i, test := range tests {
588+ c.Logf("Test #%d) %s", i, test.description)
589+ bufferSize := test.bufferSize
590+ if bufferSize == 0 {
591+ // Default value.
592+ bufferSize = 4096
593+ }
594+ reader, writer := io.Pipe()
595+ sigc := make(chan struct{}, 1)
596+ rs := startReadSeeker(c, test.data, test.initialLinesWritten, sigc)
597+ tailer := tailer.NewTestTailer(rs, writer, test.initialLinesRequested, test.filter, bufferSize, 2*time.Millisecond)
598+ linec := startReading(c, tailer, reader, writer)
599+
600+ // Collect initial data.
601+ assertCollected(c, linec, test.initialCollectedData, nil)
602+
603+ sigc <- struct{}{}
604+
605+ // Collect remaining data, possibly with injection to stop
606+ // earlier or generate an error.
607+ var injection func([]string)
608+ if test.injector != nil {
609+ injection = test.injector(tailer, rs)
610+ }
611+
612+ assertCollected(c, linec, test.appendedCollectedData, injection)
613+
614+ if test.err == "" {
615+ c.Assert(tailer.Stop(), gc.IsNil)
616+ } else {
617+ c.Assert(tailer.Err(), gc.ErrorMatches, test.err)
618+ }
619+ }
620+}
621+
622+// startReading starts a goroutine receiving the lines out of the reader
623+// in the background and passing them to a created string channel. This
624+// channel will be used in the assertions.
625+func startReading(c *gc.C, tailer *tailer.Tailer, reader *io.PipeReader, writer *io.PipeWriter) chan string {
626+ linec := make(chan string)
627+ // Start goroutine for reading.
628+ go func() {
629+ defer close(linec)
630+ reader := bufio.NewReader(reader)
631+ for {
632+ line, err := reader.ReadString('\n')
633+ switch err {
634+ case nil:
635+ linec <- line
636+ case io.EOF:
637+ return
638+ default:
639+ c.Fail()
640+ }
641+ }
642+ }()
643+	// Close writer when the tailer is stopped or has an error. Components
644+	// using the Tailer can do it the same way.
645+ go func() {
646+ tailer.Wait()
647+ writer.Close()
648+ }()
649+ return linec
650+}
651+
652+// assertCollected reads lines from the string channel linec and compares
653+// them with the lines passed in compare, until a timeout. If the timeout
654+// is reached before all lines are collected, the assertion fails. The
655+// injection function allows the processing to be interrupted by a function
656+// generating an error, or by a regular stop during the tailing. If linec
657+// is closed due to a stop or an error, only the values received so far are
658+// compared. Checking the reason for termination is done in the test.
659+func assertCollected(c *gc.C, linec chan string, compare []string, injection func([]string)) {
660+ timeout := time.After(testing.LongWait)
661+ lines := []string{}
662+ for {
663+ select {
664+ case line, ok := <-linec:
665+ if ok {
666+ lines = append(lines, line)
667+ if injection != nil {
668+ injection(lines)
669+ }
670+ if len(lines) == len(compare) {
671+ // All data received.
672+ c.Assert(lines, gc.DeepEquals, compare)
673+ return
674+ }
675+ } else {
676+ // linec closed after stopping or error.
677+ c.Assert(lines, gc.DeepEquals, compare[:len(lines)])
678+ return
679+ }
680+ case <-timeout:
681+ if injection == nil {
682+ c.Fatalf("timeout during tailer collection")
683+ }
684+ return
685+ }
686+ }
687+}
688+
689+// startReadSeeker returns a ReadSeeker for the Tailer. It simulates
690+// reading and seeking inside a file and can also simulate an error.
691+// The goroutine waits for a signal that it can start writing the
692+// appended lines.
693+func startReadSeeker(c *gc.C, data []string, initialLeg int, sigc chan struct{}) *readSeeker {
694+ // Write initial lines into the buffer.
695+ var rs readSeeker
696+ var i int
697+ for i = 0; i < initialLeg; i++ {
698+ rs.write(data[i])
699+ }
700+
701+ go func() {
702+ <-sigc
703+
704+ for ; i < len(data); i++ {
705+ time.Sleep(5 * time.Millisecond)
706+ rs.write(data[i])
707+ }
708+ }()
709+ return &rs
710+}
711+
712+type readSeeker struct {
713+ mux sync.Mutex
714+ buffer []byte
715+ pos int
716+ err error
717+}
718+
719+func (r *readSeeker) write(s string) {
720+ r.mux.Lock()
721+ defer r.mux.Unlock()
722+ r.buffer = append(r.buffer, []byte(s)...)
723+}
724+
725+func (r *readSeeker) setError(err error) {
726+ r.mux.Lock()
727+ defer r.mux.Unlock()
728+ r.err = err
729+}
730+
731+func (r *readSeeker) Read(p []byte) (n int, err error) {
732+ r.mux.Lock()
733+ defer r.mux.Unlock()
734+ if r.err != nil {
735+ return 0, r.err
736+ }
737+ if r.pos >= len(r.buffer) {
738+ return 0, io.EOF
739+ }
740+ n = copy(p, r.buffer[r.pos:])
741+ r.pos += n
742+ return n, nil
743+}
744+
745+func (r *readSeeker) Seek(offset int64, whence int) (ret int64, err error) {
746+ r.mux.Lock()
747+ defer r.mux.Unlock()
748+ var newPos int64
749+ switch whence {
750+ case 0:
751+ newPos = offset
752+ case 1:
753+ newPos = int64(r.pos) + offset
754+ case 2:
755+ newPos = int64(len(r.buffer)) + offset
756+ default:
757+ return 0, fmt.Errorf("invalid whence: %d", whence)
758+ }
759+ if newPos < 0 {
760+ return 0, fmt.Errorf("negative position: %d", newPos)
761+ }
762+ if newPos >= 1<<31 {
763+ return 0, fmt.Errorf("position out of range: %d", newPos)
764+ }
765+ r.pos = int(newPos)
766+ return newPos, nil
767+}
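The core idea of seekLastLines in the diff above — scan backwards from the end, discard an unterminated final line, and collect the last n lines that pass the filter — can be sketched in isolation. The following is a minimal, self-contained illustration (the function name lastLines and the in-memory byte slice are hypothetical, not part of the branch; the real Tailer works on an io.ReadSeeker with a bounded buffer):

```go
package main

import (
	"bytes"
	"fmt"
)

// lastLines returns the last n newline-terminated lines of data that
// match filter (a nil filter matches everything). A trailing line
// without a final '\n' is ignored, mirroring seekLastLines.
func lastLines(data []byte, n int, filter func([]byte) bool) [][]byte {
	// Find the end of the last complete line; discard anything after it.
	end := bytes.LastIndexByte(data, '\n')
	if end == -1 {
		return nil
	}
	end++
	var out [][]byte
	// Walk backwards, one line at a time, until enough lines match.
	for end > 0 && len(out) < n {
		start := bytes.LastIndexByte(data[:end-1], '\n') + 1
		line := data[start:end]
		if filter == nil || filter(line) {
			out = append(out, line)
		}
		end = start
	}
	// Lines were collected back to front; reverse into file order.
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return out
}

func main() {
	// "echo" lacks a trailing newline, so it is ignored.
	data := []byte("alpha\nbravo\ncharlie\ndelta\necho")
	for _, line := range lastLines(data, 2, nil) {
		fmt.Print(string(line))
	}
}
```

Unlike this sketch, the real implementation cannot index the whole file: it seeks backwards in bufferSize chunks and grows its buffer only when a single line spans more than one chunk.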
