dvbv5-zap crash (raspbian only)

Bug #1819650 reported by sami
This bug affects 1 person
Affects   Status  Importance  Assigned to  Milestone
Raspbian  New     Undecided   Unassigned

Bug Description

Summary: killing dvbv5-zap on Raspbian results in a "double free or corruption" error and may crash the system

Steps to reproduce:
1) Have a valid channels.conf and a USB tuner attached
2) timeout 10 dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null (or simply launch dvbv5-zap and hit Ctrl+C)

Result: *** Error in 'dvbv5-zap': double free or corruption (fasttop): 0x(hex address) ***
3) If you want to completely crash your system (complete freeze without any further notice, no log, no screen error message, unresponsive system), run:
while true;do timeout 10 dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null;done
and just wait. Sometimes it takes hours, sometimes only a matter of minutes.
All of these signals produce the same result: SIGINT (2), SIGKILL (9), SIGTERM (15)
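
The loop above runs until the machine hangs, which makes it hard to tell how many runs ended in the double-free abort. A minimal sketch of a bounded variant follows. `repro_loop` is a hypothetical helper name, not part of v4l-utils, and it assumes the abort surfaces as exit status 134 (128 + SIGABRT), which is how a shell reports a process that glibc aborted.

```shell
#!/bin/sh
# repro_loop MAX_RUNS CMD [ARGS...]
# Hypothetical helper: run CMD up to MAX_RUNS times and stop at the first
# run that ends with exit status 134 (128 + SIGABRT). Output of CMD is
# discarded, as in the original reproduction loop.
repro_loop() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        status=0
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 134 ]; then
            echo "run $run: aborted (status $status)"
            return 1
        fi
        run=$((run + 1))
    done
    echo "no abort in $max_runs runs"
    return 0
}
```

For the case in this report it could be called as `repro_loop 500 timeout --preserve-status 10 dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o -`. The `--preserve-status` option matters here: without it, `timeout` reports 124 whenever the time limit is reached, which would hide the abort.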

1) I was able to reproduce this on Raspbian with different combinations of Raspberry Pi hardware and software (Raspberry Pi 2 Model B and Raspberry Pi Zero W)
 * dvbv5-zap version 1.12.3: affected
 * dvbv5-zap version 1.16.3 from buster: affected
 * $ uname -a: Linux raspberrypi 4.14.98+ #1200 Tue Feb 12 20:11:02 GMT 2019 armv6l GNU/Linux: affected
 * $ uname -a: Linux pihole 4.19.25-v7+ #1205 SMP Mon Feb 25 18:19:20 GMT 2019 armv7l GNU/Linux: affected
2) I cannot reproduce it on
 * ubuntu 18.04 x64, dvbv5-zap version 1.14.2
 * pure debian 9 x86, dvbv5-zap version 1.12.3
3) I reproduced it with two different tuners: an rtl2832U from RTL-SDR.COM and a TerraTec Cinergy T Stick+

--> This seems to lie somewhere between raspbian, arm builds and dvbv5-zap

Tags: buster
Gregor Jasny (gjasny) wrote :

Thanks for your report. Would you be able to compile v4l-utils by yourself with address sanitiser enabled? That might give the upstream author(s) a better idea of what went wrong. It might even show you something on x86/64.

apt-get install libudev-dev gettext libtool autoconf automake pkg-config # from the top of my head

git clone git://linuxtv.org/v4l-utils.git
cd v4l-utils
./bootstrap.sh
# depending on your gcc version it might be beneficial to use clang and clang++ here
CFLAGS="-g -fsanitize=address" CXXFLAGS="-g -fsanitize=address" LDFLAGS="-fsanitize=address" ./configure --enable-static --disable-shared
make -j$(nproc)
utils/dvb/dvbv5-zap

Thanks!
Gregor

sami (miaousami) wrote :

Hi,

I wasn't able to compile with -fsanitize=address in CFLAGS or LDFLAGS.
As soon as I add those flags, I am getting:
  checking whether we are cross compiling... configure: error: in `/home/pi/v4l-utils':
  configure: error: cannot run C compiled programs.
  If you meant to cross compile, use `--host'.

And in config.log:
  configure:3245: gcc -o conftest -g -fsanitize=address -fsanitize=address conftest.c >&5
  configure:3249: $? = 0
  configure:3256: ./conftest
  ==6703==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
  configure:3260: $? = 1
  configure:3267: error: in `/home/pi/v4l-utils':
  configure:3269: error: cannot run C compiled programs.

I am **not** cross compiling, I am compiling on the raspberry pi directly.
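
The ASan message in config.log says another library is loaded before the ASan runtime. One possible source is a system-wide preload file. That is an assumption about the Raspbian setup, not something the log confirms. The sketch below only lists what such a file names, so it can be checked. `list_preloads` is a hypothetical helper, and the parsing (entries separated by whitespace or colons, `#` comments) is a simplified reading of the ld.so file format.

```shell
#!/bin/sh
# list_preloads [FILE]
# Print each library named in a dynamic-linker preload file, one per line.
# FILE defaults to /etc/ld.so.preload; a missing file prints nothing.
list_preloads() {
    file=${1:-/etc/ld.so.preload}
    [ -r "$file" ] || return 0
    # Strip '#' comments, split on spaces, tabs and colons, drop empty lines.
    sed -e 's/#.*//' "$file" | tr ' \t:' '\n\n\n' | sed -e '/^$/d'
}
```

If a library shows up here, the route the ASan message itself suggests is to preload the ASan runtime explicitly with LD_PRELOAD; `gcc -print-file-name=libasan.so` is one way to locate it. Whether that is enough to get `./configure` through on the Pi is untested here.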

Anyway, I was able to compile with debug flags:
CFLAGS="-g" LDFLAGS="-g" CXXFLAGS="-g -fsanitize=address" ./configure --enable-static --disable-shared

On git master, I do not experience the issue anymore.
* Either it has been fixed between 1.16.3 and master,
* or I am not compiling it the same way debian did.

I'll check out and compile 1.16.3 and tell you the result (just be patient, compiling on a Raspberry Pi is slow...)

sami (miaousami) wrote :

I also tried a
$ valgrind --leak-check=full --show-leak-kinds=all dvbv5-scan

but I'm having trouble with valgrind (I opened a report here: https://bugs.launchpad.net/raspbian/+bug/1819770)

sami (miaousami) wrote :

I recompiled from v4l-utils-1.16.3 tag, with CFLAGS="-g" LDFLAGS="-g" CXXFLAGS="-g -fsanitize=address" ./configure --enable-static --disable-shared, and couldn't reproduce the issue

$ git status
HEAD detached at v4l-utils-1.16.3

This leaves us with compiler options or a compiler version different from those of the Debian package.

I can try to compile without any flag, but I'm not really sure about --enable-static --disable-shared

What do you think?
What are the default Debian compile flags?
Is it cross compiled?

sami (miaousami) wrote :

I recompiled from v4l-utils-1.16.3 tag with a plain old ./configure, no specific flags, and I cannot reproduce the problem.

$ gcc --version
gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516

I don't know how to investigate further.
Please let me know what I should do.

Gregor Jasny (gjasny) wrote :

Could you please try clang as a compiler?

CC=clang CXX=clang++

Gregor Jasny (gjasny) wrote :

If you're able to find the raspbian build logs you could have a look at the build flags. Debian ones are here: https://buildd.debian.org/status/package.php?p=v4l-utils

From a glance at those, they seem to enable `-D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector-strong`. Maybe those make a difference. But ASan should also tell you about broken stacks.

sami (miaousami) wrote :

Hi,

Using clang:
$ clang --version
clang version 3.8.1-24+rpi1 (tags/RELEASE_381/final)

with:
CC=clang CXX=clang++ CFLAGS="-g -fsanitize=address" CXXFLAGS="-g -fsanitize=address" LDFLAGS="-fsanitize=address" ./configure --enable-static --disable-shared

on 1.16.3:
$ git status
HEAD detached at v4l-utils-1.16.3

then it won't compile because of:
Making all in ir-ctl
make[3]: Entering directory '/home/pi/v4l-utils/utils/ir-ctl'
  CC ir-ctl.o
  CC ir-encode.o
ir-encode.c:37:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:55:3: warning: implicit declaration of function 'add_byte' is invalid in C99 [-Wimplicit-function-declaration]
                add_byte(scancode >> 8);
                ^
ir-encode.c:111:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:126:2: warning: implicit declaration of function 'add_bits' is invalid in C99 [-Wimplicit-function-declaration]
        add_bits(scancode >> 8, 13);
        ^
ir-encode.c:141:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:174:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:217:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:225:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:233:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:266:3: warning: implicit declaration of function 'advance_space' is invalid in C99 [-Wimplicit-function-declaration]
                advance_space(NS_TO_US(rc5_unit * 4));
                ^
ir-encode.c:282:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:290:2: error: function definition is not allowed here
        {
        ^
ir-encode.c:298:2: error: function definition is not allowed here
        {
        ^
3 warnings and 10 errors generated.

Is there a way to compile only dvbv5-zap and its dependencies?
Or is my clang too old?

Gregor Jasny (gjasny) wrote :

If you run make -k it should continue as far as it can.

Gregor Jasny (gjasny) wrote :

The code does not compile because it uses nested functions, something clang does not support:

https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html
https://clang.llvm.org/docs/UsersManual.html#id56

sami (miaousami) wrote :

The build still fails, but only after compiling dvbv5-zap.
So I could test it, and I cannot reproduce the issue (same as with my own gcc builds)...

Weird...

Please find the build log attached. I am not a C developer and can't tell you whether something is wrong with it or not (apart from all the build errors).

I'm now trying to build the debian package from source to see if I can reproduce the error with package build flags...

sami (miaousami) wrote :

I was able to rebuild the debian package:
Add sources to /etc/apt/sources.list
sudo apt update
apt-get source dvb-tools
sudo apt-get build-dep dvb-tools
cd v4l-utils-1.12.3/
sudo apt-get install devscripts build-essential lintian
debuild -us -uc
cd ..
sudo apt purge dvb-tools
sudo apt autoremove
sudo dpkg -i libdvbv5-0_1.12.3-1_armhf.deb
sudo dpkg -i dvb-tools_1.12.3-1_armhf.deb

Now, when I run:
dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null
and hit Ctrl+C after a few seconds, I've got a slightly different error:
$ dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null
using demux 'dvb0.demux0'
reading channels from file 'channels.conf'
tuning to 562000000 Hz
pass all PID's to TS
  dvb_set_pesfilter 8192
       (0x00)
Lock (0x1f) Signal= 92.55% C/N= 29.78dB postBER= 0
Lock (0x1f) Signal= 92.94% C/N= 30.47dB postBER= 0
received 0 bytes
Lock (0x1f) Signal= 92.94% C/N= 31.77dB postBER= 0
free(): double free detected in tcache 2
Aborted

There is a double free, but it is not exactly the same message...

sami (miaousami) wrote :

I also compiled with:
DEB_BUILD_OPTIONS='nostrip noopt debug' dpkg-buildpackage -uc -us

The double free is still there, but I'm not able to get a core dump.
I tried kill -11; it produces "Segmentation fault", but no core dump file appears in the current directory...

How can I get a core dump? Would it be helpful?
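
On the core dump question: a "Segmentation fault" message without "(core dumped)" usually means the shell's core-size limit is 0. A minimal sketch, assuming that is the cause here. `run_with_core` is a hypothetical helper that raises the soft limit to the hard limit for one command only.

```shell
#!/bin/sh
# run_with_core CMD [ARGS...]
# Run CMD with the soft core-file size limit raised to the hard limit.
# The subshell confines the changed limit to this one command, and the
# exit status of CMD is passed through.
run_with_core() {
    (
        ulimit -S -c "$(ulimit -H -c)"
        "$@"
    )
}
```

Usage would be `run_with_core dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null`. Where the kernel writes the file is controlled by `/proc/sys/kernel/core_pattern`, so it is worth reading that file if nothing appears in the current directory. If the hard limit itself is 0, this helper cannot raise it.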

sami (miaousami) wrote :

I recompiled, this time from the buster package source, with:
DEB_BUILD_OPTIONS='nostrip noopt debug' dpkg-buildpackage -uc -us

I cannot reproduce the bug...
Next try will be buster source, without debug flags...

sami (miaousami) wrote :

Well, even though the debug build wasn't producing errors, this loop:
while true;do timeout 10 dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null;done

ended up killing my system... :-(

sami (miaousami) wrote :

I recompiled again, from buster package source, without debug.
I cannot reproduce the bug...

I did reinstall the binary package from buster, and cannot reproduce the bug... weird...
I just don't know what to do now...

:-/

sami (miaousami) wrote :

I succeeded in compiling 1.16.3 on my x64 computer, with ASan.
Here is the valgrind result:

$ valgrind --leak-check=full --show-leak-kinds=all timeout 10 ./dvbv5-zap -c channels.conf -v --lna=-1 'TF1' -P -o - > /dev/null
==5406== Memcheck, a memory error detector
==5406== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5406== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5406== Command: timeout 10 ./dvbv5-zap -c channels.conf -v --lna=-1 TF1 -P -o -
==5406==
Couldn't find demux device node

=================================================================
==5407==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 14 byte(s) in 1 object(s) allocated from:
    #0 0x7fcd7773b538 in strdup (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x77538)
    #1 0x55bc65d7cd2c in parse_opt /home/sam/tmp/v4l-utils/utils/dvb/dvbv5-zap.c:640
    #2 0x7fcd76e46c3a in argp_parse (/lib/x86_64-linux-gnu/libc.so.6+0x12fc3a)
    #3 0x55bc65d803b8 in main /home/sam/tmp/v4l-utils/utils/dvb/dvbv5-zap.c:1056
    #4 0x7fcd76d38b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

Direct leak of 12 byte(s) in 1 object(s) allocated from:
    #0 0x7fcd777a2b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x7fcd76d9f61f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8861f)

Direct leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x7fcd7773b538 in strdup (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x77538)
    #1 0x55bc65d7cc1c in parse_opt /home/sam/tmp/v4l-utils/utils/dvb/dvbv5-zap.c:628
    #2 0x7fcd76e46c3a in argp_parse (/lib/x86_64-linux-gnu/libc.so.6+0x12fc3a)
    #3 0x55bc65d803b8 in main /home/sam/tmp/v4l-utils/utils/dvb/dvbv5-zap.c:1056
    #4 0x7fcd76d38b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: 28 byte(s) leaked in 3 allocation(s).
==5406==
==5406== HEAP SUMMARY:
==5406== in use at exit: 8 bytes in 1 blocks
==5406== total heap usage: 31 allocs, 30 frees, 4,073 bytes allocated
==5406==
==5406== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1
==5406== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5406== by 0x4E40381: timer_create@@GLIBC_2.3.3 (timer_create.c:59)
==5406== by 0x10A74B: ??? (in /usr/bin/timeout)
==5406== by 0x10A386: ??? (in /usr/bin/timeout)
==5406== by 0x5284B96: (below main) (libc-start.c:310)
==5406==
==5406== LEAK SUMMARY:
==5406== definitely lost: 8 bytes in 1 blocks
==5406== indirectly lost: 0 bytes in 0 blocks
==5406== possibly lost: 0 bytes in 0 blocks
==5406== still reachable: 0 bytes in 0 blocks
==5406== suppressed: 0 bytes in 0 blocks
==5406==
==5406== For counts of detected and suppressed errors, rerun with: -v
==5406== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

It seems there are some memory leaks...

Sean Young (sean-young) wrote :

pi@raspberrypi:~/dtv-scan-tables/dvb-t $ valgrind dvbv5-zap -c dvb_channel.conf -v --lna=-1 'CBBC' -P -o - > /dev/null
==2702== Memcheck, a memory error detector
==2702== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2702== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2702== Command: dvbv5-zap -c dvb_channel.conf -v --lna=-1 CBBC -P -o -
==2702==
using demux 'dvb0.demux0'
reading channels from file 'dvb_channel.conf'
service has pid type 05: 7103
tuning to 490000000 Hz
pass all PID's to TS
  dvb_set_pesfilter 8192
==2702== Warning: noted but unhandled ioctl 0x6f2d with no size/direction hints.
==2702== This could cause spurious value errors to appear.
==2702== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
Lock (0x1f) Quality= Good Signal= -45.74dBm C/N= 0.12dB UCB= 30 postBER= 2.93
Lock (0x1f) Quality= Good Signal= -46.07dBm C/N= 0.12dB UCB= 30 postBER= 0 PER= 43.9x10^-6
Lock (0x1f) Quality= Good Signal= -46.03dBm C/N= 0.12dB UCB= 30 postBER= 0 PER= 43.9x10^-6
Record to file '-' started
^Creceived 28375404 bytes
Lock (0x1f) Quality= Poor Signal= -46.11dBm C/N= 0.03dB UCB= 4294967266 postBER= 0 PER= 33.7x10^3
==2702== Invalid free() / delete / delete[] / realloc()
==2702== at 0x4848B8C: free (vg_replace_malloc.c:530)
==2702== by 0x491A717: free_dvb_dev (in /usr/lib/arm-linux-gnueabihf/libdvbv5.so.0.0.0)
==2702== Address 0x4b7d248 is 0 bytes inside a block of size 28 free'd
==2702== at 0x4848B8C: free (vg_replace_malloc.c:530)
==2702== by 0x491C7B7: dvb_v5_free (in /usr/lib/arm-linux-gnueabihf/libdvbv5.so.0.0.0)
==2702== Block was alloc'd at
==2702== at 0x4847568: malloc (vg_replace_malloc.c:299)
==2702== by 0x49E6933: strdup (strdup.c:42)
==2702== by 0x491BDD3: ??? (in /usr/lib/arm-linux-gnueabihf/libdvbv5.so.0.0.0)
==2702==
==2702==
==2702== HEAP SUMMARY:
==2702== in use at exit: 8,194 bytes in 3 blocks
==2702== total heap usage: 1,407 allocs, 1,405 frees, 997,100 bytes allocated
==2702==
==2702== LEAK SUMMARY:
==2702== definitely lost: 2 bytes in 1 blocks
==2702== indirectly lost: 0 bytes in 0 blocks
==2702== possibly lost: 0 bytes in 0 blocks
==2702== still reachable: 8,192 bytes in 2 blocks
==2702== suppressed: 0 bytes in 0 blocks
==2702== Rerun with --leak-check=full to see details of leaked memory
==2702==
==2702== For counts of detected and suppressed errors, rerun with: -v
==2702== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 3)

Sean Young (sean-young) wrote :

The invalid free seems to occur AFTER the ^C, or when timeout sends a signal. So the issue is on shutdown and does not affect the functionality of dvbv5-zap.

This version of dvbv5-zap is ancient: 1.12.3.

I cannot reproduce the issue in 1.16.3, which is the current version.

Sean Young (sean-young) wrote :

Looks like commit 6e21f6f34c1d7c3a7a059062e1ddd9705c984e2c fixed it.
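
To check whether a given v4l-utils checkout already contains that commit, git can test ancestry directly. A minimal sketch; `has_commit` is a hypothetical helper name. It only recognises the commit under its original hash, so a cherry-picked backport on a stable branch, which gets a new hash, would not be detected this way.

```shell
#!/bin/sh
# has_commit COMMIT [REF]
# Succeed (exit 0) if COMMIT is an ancestor of REF (default: HEAD) in the
# repository of the current directory, fail otherwise.
has_commit() {
    commit=$1
    ref=${2:-HEAD}
    git merge-base --is-ancestor "$commit" "$ref"
}
```

In a v4l-utils clone this could be run as `has_commit 6e21f6f34c1d7c3a7a059062e1ddd9705c984e2c v4l-utils-1.16.3 && echo "fix included"`.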

sami (miaousami) wrote :

Hi Sean,
thanks for your feedback.
Did you also reproduce random hangs?

In the end there are two issues:
1) The double free from 1.12.3: as you said, it seems fixed in 1.16.3. Do you think it can be backported for debian 9?

2) The random crashes that **seem** to happen only on ARM/Raspberry Pi: Mauro gave me a patch that I'm currently testing (so far, after a few hours, no crash. I hope it'll still be OK tonight). If it's okay, then we'll have to figure out whether it can be backported to Debian 9 and Debian 10...

Gregor Jasny (gjasny) wrote :

Hello Sean,

Andre Rhode's patch looks incomplete. The patched code path for dvb_fe_open_fname looks like this:

int dvb_fe_open_fname(struct dvb_v5_fe_parms_priv *parms, char *fname,
        int flags)
{
 struct dtv_properties dtv_prop;
 int fd, i;

 fd = open(fname, flags, 0);
 if (fd == -1) {
  dvb_logerr(_("%s while opening %s"), strerror(errno), fname);
  free(fname);
  return -errno;
 }

 if (xioctl(fd, FE_GET_INFO, &parms->p.info) == -1) {
  dvb_perror("FE_GET_INFO");
  close(fd);
  return -errno;
 }

Shouldn't the free in the open-error path be removed as well?

Gregor Jasny (gjasny) wrote :

Sami, once I'm happy with the patches I'll file a request with the Debian release team to accept the patches for Debian Stretch.

Sean Young (sean-young) wrote :

Hello Gregor,

Good spot. Actually, there are multiple mistakes there. I've sent a patch to <email address hidden> for this.

Let's wait for review comments, it also needs some more testing before merging.

Thanks
Sean

sami (miaousami) wrote :

Hi guys,
thanks for your help!

Mauro upstreamed his fix for random crash on master [22b06353227e04695b1b0a9622b896b948adba89](https://git.linuxtv.org/v4l-utils.git/commit/?id=22b06353227e04695b1b0a9622b896b948adba89) and backported it to branches stable-1.16 and stable-1.12.

The double free fix was also backported to stable-1.12: https://git.linuxtv.org/v4l-utils.git/commit/?h=stable-1.12

I'm currently retesting against stable-1.12, and then I'll retest against stable-1.16. It just takes some time because I let the while loop run for at least 24 hours.

I haven't experienced the crash again so far...

I also subscribed to <email address hidden> to follow Sean's fixes...

:-)

sami (miaousami) wrote :

Hi!

Both stable-1.12 and stable-1.16 are patched and tested OK.

Can you update/rebuild debian stretch and buster?

Will raspbian get automatically updated? Or is there something to do on their side?

Regards

sami (miaousami) wrote :

There are still cases of random hangs; the discussion is ongoing here: https://<email address hidden>/msg145702.html

Pander (pander)
tags: added: buster