This is a good start, but I have a few comments and suggestions.
I wonder if it might sit well inside its own package rather than being
added to the grab-bag of stuff in utils.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go
File utils/tailer.go (right):
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode30
utils/tailer.go:30: func StartFileTailer(filename string, lines int, filter TailerFilterFunc,
I think this function is unnecessary. It's trivial for other code to
open a file.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode40
utils/tailer.go:40: func StartTailer(readSeeker io.ReadSeeker, lines int, filter TailerFilterFunc,
Please document this function properly.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode87
utils/tailer.go:87: println("> error:", err.Error())
d
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode98
utils/tailer.go:98: if buffer[i] == '\n' {
This isn't strictly accurate (what happens when we have an unterminated
line at the end of the file?) but it's probably ok if we never filter
unterminated lines. A comment about why it's ok might prevent future
puzzling over the correctness of the code.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode100
utils/tailer.go:100: if foundNewlines-1 == t.lines {
s/==/>=/, defensively.
Also, if we move the if statement before the increment,
then you can lose the "-1".
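As a sketch of what that reordering might look like (the function and
variable names here are hypothetical, not from the CL; `wanted` stands in
for t.lines):

```go
package main

import "fmt"

// tailStart scans backwards for the offset where the last `wanted`
// terminated lines begin. Checking the count before incrementing removes
// the "-1", and ">=" guards defensively against ever overshooting. Bytes
// after a final '\n' (an unterminated line) are included in the tail but
// never counted as a line.
func tailStart(buf []byte, wanted int) int {
	foundNewlines := 0
	for i := len(buf) - 1; i >= 0; i-- {
		if buf[i] == '\n' {
			if foundNewlines >= wanted {
				return i + 1
			}
			foundNewlines++
		}
	}
	return 0 // fewer than wanted lines: start at the beginning
}

func main() {
	fmt.Println(tailStart([]byte("a\nb\nc\n"), 2)) // prints 2, the offset of "b\nc\n"
}
```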
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode127
utils/tailer.go:127: line, err := reader.ReadString('\n')
I think I'd be tempted to use ReadSlice here (changing the type of
filter to func([]byte) bool).
That way the tailer won't need to generate any garbage at all AFAICS.
Unfortunately that limits the line size to the size of the bufio buffer,
which may be acceptable. An alternative is to use ReadLine and allocate
only if the line is too long.
Here's some code I wrote a while ago to do that, in case it might be
useful:
// readLine reads a line from r.
// The returned byte slice is only valid until the
// next read call on r.
func readLine(r *bufio.Reader) ([]byte, error) {
	line, isPrefix, err := r.ReadLine()
	if !isPrefix {
		return line, err
	}
	buf := append([]byte(nil), line...)
	for isPrefix && err == nil {
		line, isPrefix, err = r.ReadLine()
		buf = append(buf, line...)
	}
	return buf, err
}
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode147
utils/tailer.go:147: for {
This code is exactly the same as the code above.
Can't we make do with just this? (If we just delete the
loop above and use NewTimer(0), I think it might
just work.)
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode149
utils/tailer.go:149: if len(line) > 0 {
I think we should ignore the line at this point if it is unterminated,
something that's quite possible if the file is being written in
arbitrary chunks. If we find an unterminated line, we should seek back
to the start of the line before we start reading again.
That means we could pass the line to filter without the trailing \n,
meaning we can be more resilient if something starts producing lines
ending in \r\n.
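A hedged sketch of that handling (the helper name is hypothetical, not
the tailer's API): only terminated lines are passed on, with the trailing
"\n" and any preceding "\r" stripped, so CRLF-producing writers behave
the same as LF ones; an unterminated line is reported so the caller can
seek back and retry.

```go
package main

import "fmt"

// terminatedLine returns the line without its terminator and true if the
// line ends in '\n' (optionally preceded by '\r'); otherwise it reports
// false, meaning the caller should seek back to the line start and retry.
func terminatedLine(line []byte) ([]byte, bool) {
	if len(line) == 0 || line[len(line)-1] != '\n' {
		return nil, false
	}
	line = line[:len(line)-1]
	if len(line) > 0 && line[len(line)-1] == '\r' {
		line = line[:len(line)-1]
	}
	return line, true
}

func main() {
	l, ok := terminatedLine([]byte("hello\r\n"))
	fmt.Printf("%q %v\n", l, ok) // prints "hello" true
}
```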
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode151
utils/tailer.go:151: writer.WriteString(line)
We should definitely check the error here and in Flush below - I don't
think we want to carry on trying to write to a dead network connection
or full filesystem.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer.go#newcode162
utils/tailer.go:162: t.readSeeker.Seek(0, os.SEEK_END)
This is wrong, I think - we could miss data if some has been
written in between seeing the EOF and this seek.
I think the only time we want to seek is if we get an unterminated line.
That said, there's another tricky case - what happens if the file gets
truncated? I wonder if that should be done with separate logic to detect
whether the file has suddenly reduced in size. We'll need separate logic
anyway to decide when to reopen the file when log files roll over, so
that might work out ok.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go
File utils/tailer_test.go (right):
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newcode25
utils/tailer_test.go:25: buffer := bytes.NewBuffer([]byte{})
var buffer bytes.Buffer
is simpler and does almost the same thing.
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newcode124
utils/tailer_test.go:124: func assertCollected(c *gc.C, buffer *bytes.Buffer, collected []string, addon func([]string), timeout bool) {
Please could we have a comment describing what this function is doing?
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newcode162
utils/tailer_test.go:162: var rs *readSeeker = new(readSeeker)
var rs readSeeker
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newcode206
utils/tailer_test.go:206: if r.pos == len(r.buffer) {
I think we can usefully use the result of copy here:
if r.pos >= len(r.buffer) {
	return 0, io.EOF
}
n := copy(p, r.buffer[r.pos:])
r.pos += n
return n, nil
https://codereview.appspot.com/36540043/diff/20001/utils/tailer_test.go#newcode244
utils/tailer_test.go:244: func signal(c *gc.C, sigc chan struct{}) {
Given that the signal channel is buffered, can't we just send on the
channel?
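Assuming the channel really is buffered, the helper could reduce to a
plain send; a minimal sketch:

```go
package main

import "fmt"

func main() {
	// With a buffer of one, the send below cannot block even if the
	// receiver is not yet waiting, so no helper goroutine is needed.
	sigc := make(chan struct{}, 1)
	sigc <- struct{}{}
	<-sigc
	fmt.Println("signalled") // prints signalled
}
```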
https://codereview.appspot.com/36540043/