babeltrace:stable-2.0

Last commit made on 2020-07-17
Get this branch:
git clone -b stable-2.0 https://git.launchpad.net/babeltrace

Branch merges

Branch information

Name:
stable-2.0
Repository:
lp:babeltrace

Recent commits

6b1463e... by Jérémie Galarneau <email address hidden> on 2020-07-16

Fix: source.ctf.lttng-live: muxing failure on clear (unit conversion)

This commit is a follow-up fix for 8ec4d5ff (see original message).
The original fix included a bogus comparison of:
  `stream_iter->last_inactivity_ts > curr_msg_ts_ns`

While the idea behind the fix is valid, this statement compares
nanoseconds since Unix Epoch (former) to clock cycles (latter).

A conversion of the `last_inactivity_ts` to nanoseconds since Unix epoch
is performed using the stream's default clock class allowing a
comparison in a common time base to take place.

The diff looks more intimidating than it really is as it shifts the
indentation of a lot of code at once. This is because we only want to
perform the timestamp conversion when necessary (very rarely) on this
fairly hot path.

Reviewed-by: Francis Deslauriers <email address hidden>
Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: Ibdc365fec4685da88ae141383d5e5ef0af169a87

969f43d... by Jérémie Galarneau <email address hidden> on 2020-05-27

Fix: src.ctf.lttng-live: incomplete metadata packet is an error

Observed issue
==============

While investigating the issue described in [1], I noticed that
babeltrace2 falls into a retry loop when the src.ctf.lttng-live
component encounters unparseable metadata.

The src.ctf.lttng-live reports parsing errors on every subsequent
reception of a metadata packet. The relay daemon eventually sends
binary data which fails to be decoded, ending the graph's execution
with a binary decoding error, which is not the "real" issue.

Cause
=====

Due to a (now fixed) bug in LTTng [1], unparseable (incomplete)
metadata can be made visible to live clients. This bug fix doesn't
involve the clients; it resulted in an illegal state in the lttng-live
protocol.

When the relay daemon notifies the live client that new metadata is
available, lttng_live_metadata_update() receives all the metadata made
available by the relay daemon in a memory stream and then uses the
ctf_metadata_decoder to append the new content to the existing trace
class.

However, if the decoder returns
`CTF_METADATA_DECODER_STATUS_INCOMPLETE`,
`LTTNG_LIVE_ITERATOR_STATUS_AGAIN` is returned to the caller,
resulting in the graph retrying to invoke the live iterator's `next`
method until another error eventually prevents the successful
completion of the graph.

I am assuming that the use of the `AGAIN` status code may have been a
failed attempt at fixing [1] in the live component rather than
adressing the underlying problem.

Solution
========

To my knowledge there are no provisions made for incomplete metadata
in the lttng-live protocol as of the current version. This may have
been done in anticipation of a future change (?), but it currently
obscures the error reported when a corrupted/incomplete/unparseable
metadata packet is received by the src.ctf.lttng-live component.

The current approach doesn't work anyhow as
lttng_live_metadata_update() creates a "fresh" memory stream on each
invocation. If the intention was to accumulate partial metadata until
it can be successfully parsed, the accumulated metadata would have to
be preserved from one invocation to the next.

The conversion of the status code from `INCOMPLETE` to `AGAIN` is
removed to fail immediately during the current invocation of the
iterator's `next` method.

Drawbacks
=========

None.

References
==========

[1] https://github.com/lttng/lttng-tools/commit/f5ba75b4f0c0b44092c76bc931b25b24a2e62718

Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: I8a379ea5d838786a6731199dd5f03bbf70ec13f5
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3586
Tested-by: jenkins <email address hidden>
Reviewed-by: Philippe Proulx <email address hidden>

141914f... by Jérémie Galarneau <email address hidden> on 2020-06-04

Fix: source.ctf.lttng-live: muxing failure on clear

Observed issue
==============

The lttng-tools port builds regularly fail on the CI with the following
error:

```
05-28 16:13:52.864 11219 11219 E PLUGIN/SRC.CTF.LTTNG-LIVE <email address hidden>:1067 [lttng-live] Message's timestamp is less than lttng-live's message iterator's last returned timestamp: lttng-live-msg-iter-addr=0xd810220dd30, ts=1590696831733594090, last-msg-ts=1590696832514058899
05-28 16:13:52.864 11219 11219 E PLUGIN/SRC.CTF.LTTNG-LIVE <email address hidden>:1502 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
05-28 16:13:52.864 11219 11219 W LIB/MSG-ITER <email address hidden>:868 Component input port message iterator's "next" method failed: iter-addr=0xd810221c7c0, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
05-28 16:13:52.864 11219 11219 W LIB/GRAPH <email address hidden>:466 Component's "consume" method failed: status=ERROR, comp-addr=0xd81022373a0, comp-name="sink.text.details", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="details", comp-class-partial-descr="Print messages with details.", comp-class-is-frozen=0, comp-class-so-handle-addr=0xd8102233ca0, comp-class-so-handle-path="/tmp/debug_jgalar/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
05-28 16:13:52.865 11219 11219 E CLI <email address hidden>:2530 Graph failed to complete successfully

ERROR: [Babeltrace CLI] (babeltrace2.c:2530)
  Graph failed to complete successfully
CAUSED BY [libbabeltrace2] (graph.c:466)
  Component's "consume" method failed: status=ERROR, comp-addr=0xd81022373a0,
  comp-name="sink.text.details", comp-log-level=WARNING, comp-class-type=SINK,
  comp-class-name="details", comp-class-partial-descr="Print messages with
  details.", comp-class-is-frozen=0, comp-class-so-handle-addr=0xd8102233ca0,
  comp-class-so-handle-path="/tmp/debug_jgalar/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
  comp-input-port-count=1, comp-output-port-count=0
CAUSED BY [libbabeltrace2] (iterator.c:868)
  Component input port message iterator's "next" method failed:
  iter-addr=0xd810221c7c0, iter-upstream-comp-name="lttng-live",
  iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
  iter-upstream-comp-class-name="lttng-live",
  iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
  iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1502)
  Error preparing the next batch of messages:
  live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1067)
  Message's timestamp is less than lttng-live's message iterator's last returned
  timestamp: lttng-live-msg-iter-addr=0xd810220dd30, ts=1590696831733594090,
  last-msg-ts=1590696832514058899
```

The test that ends up with this error performs a session `clear`
while tracing an application as part of a "live" session.

Cause
=====

Given the following trace,

| pkt seq:0 |<-------discarded packets------>| pkt seq:8 |
0 20 121 140

a CTF source is expected to introduce a "Discarded Packets" message
between packets 0 and 8. The begin and end timestamps of such a message
are synthesized from the timestamps of the last known packet (seq:0)
and the newly decoded packet (seq:8).

For instance, here the Discarded Packets message would have the bounds
[pkt0.end_ts, pkt8.begin_ts].

In the context of a live source, the tracer could report an inactivity
period during the interval of the "Discarded packets" message.

Those live protocol messages eventually translate into an "Iterator
Inactivity" message with a timestamp set at the end of the inactivity
period.

If the tracer reports an inactivity period that ends at a point
between pkt0 and pkt8 (resulting in an "Iterator Inactivity" message),
the live source will send a "Discarded Packets" message that starts
before the preceding "Iterator Inactivity" message, breaking the
monotonicity constraint of the graph.

This happens regularly when `clear` is used on an LTTng live session as
it effectively discards packets, something that would otherwise not
happen (before LTTng 2.12) as most users set their tracing channels in
"discard" mode (the default for that session type).

However, the same problem as described above applies to "Discarded
Events" messages and could occur with older LTTng versions.

Solution
========

"Discarded Events" and Discarded Packets" messages are intercepted
in the lttng-live source's "muxer" and are _adjusted_ so that
they honor the monotonicity guarantee of the graph.

In effect, whenever such a message is seen to have a begin timestamp
that is _before_ the last inactivity timestamp, its begin timestamp
is adjusted to the last inactivity timestamp's value.

From a correctness standpoint, this is okay. If the tracer notified the
relay daemon of an inactivity period on a stream, we can rely on the
fact that no data was produced during that time to affirm that no
packets where lost during that time either.

A test reproducing the "Discarded Packets" scenatio described in
this message is added in a follow-up patch.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: Ia8f32ba6717b33203637bf8d5aa34ff2a78c3e7f
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3616
Tested-by: jenkins <email address hidden>

c0b480d... by Jérémie Galarneau <email address hidden> on 2020-06-01

Fix: source.ctf.fs: 0-length packet index length causes SIGFPE

A corrupted index can present a 0-length packet index length
which will result in a division by 0 when computing the index
entry count.

Program terminated with signal SIGFPE, Arithmetic exception.
 #0 0x00007f6ecbd44978 in build_index_from_idx_file (ds_file=0x561ade51ca00, file_info=0x561ade51d000,
    msg_iter=0x561ade51cd00) at data-stream-file.c:640
640 file_entry_count = (filesize - sizeof(*header)) / file_index_entry_size;

The index packet length is checked against the smallest valid size:
the size of an index entry as of the 1.0 CTF index version.

Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: I83c705575d55f3b56ae413d1ce5ae0fc60121f2c
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3606
Tested-by: jenkins <email address hidden>
Reviewed-by: Philippe Proulx <email address hidden>

71d8a83... by Jérémie Galarneau <email address hidden> on 2020-04-23

Update working version to Babeltrace 2.0.4

Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: Idfcf7e0dfbe4841c4032b60b11748a8cfb744d0c

c85dbd4... by Jérémie Galarneau <email address hidden> on 2020-04-23

Release: Babeltrace 2.0.3 "Amqui"

Signed-off-by: Jérémie Galarneau <email address hidden>
Change-Id: Ib443f69b40fbfd04a779b31ecda70e25827d4e97

1e66c11... by Simon Marchi <email address hidden> on 2020-04-22

Fix: lib: use appropriate format specifier to print message iterator class

Running

  babeltrace2 --debug /some/trace

... results in the following crash:

    ==31705==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f5b9c25ae22 bp 0x7ffc64114490 sp 0x7ffc64113ba8 T0)
    ==31705==The signal is caused by a READ memory access.
    ==31705==Hint: address points to the zero page.
        #0 0x7f5b9c25ae21 (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xfce21)
        #1 0x7f5b9c1d231f (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x7431f)
        #2 0x7f5b9c1fef4d in __interceptor_vsnprintf (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xa0f4d)
        #3 0x7f5b9c1ff286 in snprintf (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xa1286)
        #4 0x7f5b9bdf61c3 in format_component_class /home/smarchi/src/babeltrace/src/lib/lib-logging.c:1061
        #5 0x7f5b9bdfad46 in handle_conversion_specifier_bt /home/smarchi/src/babeltrace/src/lib/lib-logging.c:1451
        #6 0x7f5b9bead260 in bt_common_custom_vsnprintf /home/smarchi/src/babeltrace/src/common/common.c:1727
        #7 0x7f5b9bdfb10b in bt_lib_log /home/smarchi/src/babeltrace/src/lib/lib-logging.c:1496
        #8 0x7f5b9be32965 in bt_message_iterator_class_set_seek_beginning_methods /home/smarchi/src/babeltrace/src/lib/graph/message-iterator-class.c:138
 ...

This is due to an object being printed with the wrong format specifier.
`C` is for component class objects, use `I` instead, since we are
printing a message iterator object.

Change-Id: Ie3985d8b7c4e02ac7d09aee84bfe46ea4bab87e9
Signed-off-by: Simon Marchi <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3367
CI-Build: Philippe Proulx <email address hidden>
Tested-by: jenkins <email address hidden>
Reviewed-by: Philippe Proulx <email address hidden>

5ca921c... by Simon Marchi <email address hidden> on 2020-04-14

Fix: sink.text.pretty: check that port is connected before creating message iterator

sink.text.pretty does not check if its input port is connected before
trying to create a message iterator on it. This can lead to a
precondition assertion failure. It can be reproduced with this Python
snippet.

    import bt2
    g = bt2.Graph()
    g.add_component(bt2.find_plugin('text').sink_component_classes['pretty'], 'snk')
    g.run()

The assertion failure we get is:

    04-14 11:35:27.339 1231815 1231815 F LIB/MSG-ITER <email address hidden>:295 Babeltrace 2 library precondition not satisfied; error is:
    04-14 11:35:27.339 1231815 1231815 F LIB/MSG-ITER <email address hidden>:295 Input port is not connected: port-addr=0x607000001d70, port-type=INPUT, port-name="in"
    04-14 11:35:27.339 1231815 1231815 F LIB/MSG-ITER <email address hidden>:295 Aborting...
    ./tests/utils/../utils/utils.sh: line 283: 1231815 Aborted (core dumped) env "${env_args[@]}" "$@"

Add a check and return an error if that happens instead. A
corresponding test is also added.

Change-Id: Ibeed94cd6ece543817fe8a765a69cb52bbaaba76
Signed-off-by: Simon Marchi <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3403
Reviewed-by: Philippe Proulx <email address hidden>
Tested-by: jenkins <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3411

2467456... by Simon Marchi <email address hidden> on 2020-03-26

Add initial Python bindings documentation

This initial documentation contains a home page, an installation page,
and a few examples to understand how the `bt2` package works.

Still missing: how exactly the bindings wrap libbabeltrace2 (wraping
rules, exceptions, etc.).

Changes:

`README.adoc`:
    Specify that you need Sphinx to build the Python bindings
    documentation.

`configure.ac` and `m4/check_sphinx.m4`:
    Add `--enable-python-bindings-doc` which requires
    `--enable-python-bindings`.

    This is because the Sphinx configuration file actually imports the
    `bt2` package to get the version (and, eventually, for Sphinx's
    autodoc to find docstrings within the `bt2` modules).

`doc/bindings/python/source`:
    The actual documentation's contents and configuration.

`doc/bindings/python/ext/bt2sphinxurl.py`:
    A Sphinx extension to add Babeltrace 2 manual page and other links
    of which the URL includes the project's version.

Signed-off-by: Simon Marchi <email address hidden>
Signed-off-by: Philippe Proulx <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3278

Backported from ba64dfcccb1f1bd7a259dc5d563ba422b8375582

Modifications:
 configure.ac:
   PPRINT_PROP_BOOL_CUSTOM -> PPRINT_PROP_BOOL
 examples.rst:
   Removed example usage for `in`:
      if 'next_comm' in event:
    This utility was introduced by
    f03b6364aec2d77bbb5ef0625cbaea8de4179f63, not backported,
    and only provides a shortcut to iterate over all `root` fields.
    User can implement their own as needed.
  Updated referenced version of lttng from 2.11 to "latest" for
  lttng-live url.

Signed-off-by: Jonathan Rajotte <email address hidden>
Change-Id: Id0f4636a5f66e98d383548bdcb894f53d31812b4
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3408
Reviewed-by: Philippe Proulx <email address hidden>
Tested-by: jenkins <email address hidden>

31592d1... by Simon Marchi <email address hidden> on 2020-04-11

Fix: bt2: read properties on _DiscardedEventsMessage

Reading the count and clock snapshot properties on a _DiscardedEventsMessage
does not work. For example, this:

    msg = self._create_discarded_events_message(stream, count=10)
    print(msg.count)

Results in:

      File "test.py", line 10, in __init__
        print(msg.count)
      File "/home/simark/build/babeltrace/src/bindings/python/bt2/build/build_lib/bt2/message.py", line 208, in count
        avail, count = self._get_count(self._ptr)
    AttributeError: '_DiscardedEventsMessage' object has no attribute '_get_count'

The problem is simply that _DiscardedEventsMessage is missing inheriting
from _DiscardedEventsMessageConst.

Change-Id: Id40ea54bb26b4e4aec264d84bda52ed815090341
Signed-off-by: Simon Marchi <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3392
Tested-by: jenkins <email address hidden>
Reviewed-by: Philippe Proulx <email address hidden>
Reviewed-on: https://review.lttng.org/c/babeltrace/+/3365