Last commit made on 2023-09-20
Get this branch:
git clone -b main

Recent commits

79f44f3... by Edward Hope-Morley

Remove deprecated args and MPCacheSharded

6ae75b9... by Edward Hope-Morley

Cleanup the new binary search code and add extras

 * reduces size of log messages to essential information
 * cleans up code style consistency and docstrings
 * ensures the first and last lines are checked before the
   full bisect

d917a81... by Edward Hope-Morley

Merge pull request #10 from mustafakemalgilor/enhancement/faster-search-since-constraint

searchkit/constraints: rewrite of binary search algorithm

12a6c36... by Mustafa Kemal Gilor

searchkit/constraints: rewrite of binary search algorithm

Implemented a new binary search algorithm that no longer needs
filemarkers or knowing the lines beforehand, which reduces the
time spent applying a SearchConstraintSearchSince to a file,
especially if the file is large in size.

Removed the following classes which are no longer necessary:

- SkipRange
- SkipRangeOverlapException
- BinarySearchState
- FileMarkers (and respective unit tests)
- SeekInfo

Removed `test_logs_since_junk_not_allow_unverifiable` test case since
we're no longer parsing all lines in the file.

Removed following functions from BinarySeekSearchBase:

- _seek_and_validate
- _check_line
- _seek_next

Introduced the following new classes:

- LogFileDateSinceOffsetSeeker (the main binary search class)
- DateSearchFailedAtOffset (exception type)
- NoLogsFoundSince (exception type)
- NoDateFoundInLogs (exception type)

Signed-off-by: Mustafa Kemal Gilor <email address hidden>
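
As an illustration of the idea described in this commit, the following is a minimal sketch of a binary search over byte offsets that finds the first log line dated at or after a given time without reading every line. It is not the searchkit implementation: the names `line_start`, `date_at` and `find_offset_since` are hypothetical, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that lines are in chronological order. It also checks the first and last lines before bisecting, as the later cleanup commit describes.

```python
import io
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
TS_LEN = 19                      # length of a timestamp in that layout


def line_start(f, offset):
    """Walk backwards from offset to the start of the line containing it."""
    while offset > 0:
        f.seek(offset - 1)
        if f.read(1) == b"\n":
            break
        offset -= 1
    return offset


def date_at(f, offset):
    """Parse the timestamp of the line that starts at offset."""
    f.seek(offset)
    raw = f.read(TS_LEN).decode("ascii", errors="replace")
    return datetime.strptime(raw, TS_FORMAT)


def find_offset_since(f, since):
    """Return the byte offset of the first line dated >= since, or None."""
    size = f.seek(0, io.SEEK_END)
    if size == 0:
        return None
    # Cheap checks on the first and last lines before bisecting.
    if date_at(f, 0) >= since:
        return 0
    last = line_start(f, size - 1)
    if date_at(f, last) < since:
        return None
    # Invariant: the answer is a line start in the range [lo, hi].
    lo, hi = 0, last
    while lo < hi:
        start = line_start(f, (lo + hi) // 2)
        if date_at(f, start) >= since:
            hi = start
        else:
            # Skip past this line; the answer starts at or after the next one.
            f.seek(start)
            f.readline()
            lo = f.tell()
    return lo


log = io.BytesIO(
    b"2023-09-18 10:00:00 a\n"
    b"2023-09-19 10:00:00 b\n"
    b"2023-09-20 10:00:00 c\n"
    b"2023-09-21 10:00:00 d\n"
)
offset = find_offset_since(log, datetime(2023, 9, 19, 12, 0, 0))
log.seek(offset)
print(log.readline().decode())
```

The file object must be opened in binary mode so that offsets are byte positions. Each probe only reads the bytes around one line, so the number of lines parsed grows with the logarithm of the file size rather than with the number of lines.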

b9e5674... by Edward Hope-Morley

Support alternate unicode decode error handlers

Instead of only supporting "strict" mode and silently skipping
files that raise a UnicodeDecodeError we now raise the error
and add a new "decode_errors" kwarg to FileSearcher that supports
setting alternate handlers such as backslashreplace, ignore etc.

Also fixes unit tests logger.
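
To show what the different error handlers do, here is a small sketch using Python's built-in `bytes.decode`. The helper `decode_line` is hypothetical and only stands in for whatever FileSearcher does internally; the handler names themselves ("strict", "ignore", "backslashreplace") are the standard Python codec error handlers.

```python
def decode_line(raw, decode_errors="strict"):
    """Decode a raw log line as UTF-8 using the given error handler."""
    return raw.decode("utf-8", errors=decode_errors)


raw = b"ok \xff"  # 0xff is never valid in UTF-8

# "strict" raises instead of silently skipping the data.
try:
    decode_line(raw)
except UnicodeDecodeError as exc:
    print("strict failed:", exc.reason)

# "backslashreplace" keeps the line and escapes the bad byte.
print(decode_line(raw, "backslashreplace"))

# "ignore" keeps the line and drops the bad byte.
print(decode_line(raw, "ignore"))
```

With "strict" the caller decides what to do with a bad file, while the other handlers trade fidelity for the ability to keep searching.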

94a264a... by Edward Hope-Morley

Configure logging handler if none exists

ab21741... by Edward Hope-Morley

Set logger name

cede539... by Edward Hope-Morley

Remove dependency on _gdbm

This import is not from stdlib so we now use dbm which is.

bcfd6e3... by Edward Hope-Morley

Improve full search result debug message

Adds search def tag to message to help identify inefficient
search expressions.

c4a0b3b... by Edward Hope-Morley

Fixes case where search paths overlap

If the paths used to register searches overlap once
expanded, they will cause the same file to be searched
concurrently which breaks the MPCache and is also
superfluous. This patch fixes that problem and applies
some minor optimisations to the way we extract
datetime from start of line to apply constraints.

Also fixes support for applying constraints to files
containing unicode characters by ensuring that we escape
rather than decode those characters.
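
One way to avoid searching the same file twice when registered paths overlap is to expand every path and deduplicate the results before dispatching any work. The sketch below illustrates that idea only; `expand_unique` is a hypothetical helper and is not taken from the patch.

```python
import glob
import os


def expand_unique(patterns):
    """Expand glob patterns and return each matching file exactly once."""
    seen = set()
    unique = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            # Resolve symlinks so that aliases of one file compare equal.
            real = os.path.realpath(path)
            if real not in seen:
                seen.add(real)
                unique.append(real)
    return unique
```

For example, `expand_unique(["/var/log/*.log", "/var/log/syslog.log"])` would list `syslog.log` once even though both patterns match it.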