Qt on i.MX6 with -platform eglfs -> Segmentation fault

I have cross-compiled Qt 5.1.1 for an i.MX6 powered Nitrogen6x board running Debian 7 (wheezy).
I have configured Qt with the -egl parameter and eglfs has been listed as QPA backend in the configure output.
However, when I run a small example application with the -platform eglfs parameter, I get this error:
stdin: is not a tty
[ 1] HAL user version 4.6.9 build 6622 Aug 15 2013 13:22:40
[ 2] HAL kernel version 4.6.9 build 1210
QML debugging is enabled. Only use this in a safe environment.
bash: line 1: 3673 Segmentation fault DISPLAY=:0.0 /opt/Test/bin/Test -platform eglfs
Remote application finished with exit code 139.
OpenGL ES2 and EGL are installed on the board and can be found in /usr/lib and /usr/include.
Sadly I couldn't find proper documentation for eglfs, so I am hoping that someone around here has some experience with it.
This is the backtrace output:
run Test-platform eglfs
Starting program: /opt/Test/bin/Test Test -platform eglfs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
[ 1] HAL user version 4.6.9 build 6622 Aug 15 2013 13:31:17
[ 2] HAL kernel version 4.6.9 build 1210
QML debugging is enabled. Only use this in a safe environment.
[New Thread 0x2c6b7460 (LWP 4057)]
Program received signal SIGSEGV, Segmentation fault.
0x2bab6f48 in gcoHAL_QueryChipCount () from /usr/lib/libGAL.so
(gdb) backrace full
Undefined command: "backrace". Try "help".
(gdb) backtrace full
#0 0x2bab6f48 in gcoHAL_QueryChipCount () from /usr/lib/libGAL.so
No symbol table info available.
#1 0x2ba7ccbc in veglGetThreadData () from /usr/lib/libEGL.so.1
No symbol table info available.
#2 0x2ba74cd0 in eglBindAPI () from /usr/lib/libEGL.so.1
No symbol table info available.
#3 0x2be41934 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
No symbol table info available.
#4 0x2be41934 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info registers
r0 0x1 1
r1 0x23e54 147028
r2 0x738 1848
r3 0x0 0
r4 0x2bb67d84 733379972
r5 0x23e18 146968
r6 0x2e70c 190220
r7 0x2b430198 725811608
r8 0x7efff9e8 2130704872
r9 0x8 8
r10 0x2b0725c4 721888708
r11 0x7efffae0 2130705120
r12 0x2bab6f1c 732655388
sp 0x7efff8f0 0x7efff8f0
lr 0x2ba7ccbc 732417212
pc 0x2bab6f48 0x2bab6f48 <gcoHAL_QueryChipCount+44>
cpsr 0x80000010 -2147483632
(gdb) x/16i $pc
=> 0x2bab6f48 <gcoHAL_QueryChipCount+44>: ldr r3, [r3, #12]
0x2bab6f4c <gcoHAL_QueryChipCount+48>: sub r2, r3, #1
0x2bab6f50 <gcoHAL_QueryChipCount+52>: cmp r2, #2
0x2bab6f54 <gcoHAL_QueryChipCount+56>: bhi 0x2bab6f70 <gcoHAL_QueryChipCount+84>
0x2bab6f58 <gcoHAL_QueryChipCount+60>: ldr r2, [r4]
0x2bab6f5c <gcoHAL_QueryChipCount+64>: mov r0, #0
0x2bab6f60 <gcoHAL_QueryChipCount+68>: str r3, [r1]
0x2bab6f64 <gcoHAL_QueryChipCount+72>: add r3, r2, #1
0x2bab6f68 <gcoHAL_QueryChipCount+76>: str r3, [r4]
0x2bab6f6c <gcoHAL_QueryChipCount+80>: pop {r4, pc}
0x2bab6f70 <gcoHAL_QueryChipCount+84>: mvn r0, #8
0x2bab6f74 <gcoHAL_QueryChipCount+88>: bl 0x2baad5fc
0x2bab6f78 <gcoHAL_QueryChipCount+92>: ldr r3, [r4]
0x2bab6f7c <gcoHAL_QueryChipCount+96>: mvn r0, #8
0x2bab6f80 <gcoHAL_QueryChipCount+100>: add r3, r3, #1
0x2bab6f84 <gcoHAL_QueryChipCount+104>: str r3, [r4]
(gdb) thread apply all backtrace
Thread 2 (Thread 0x2c6b7460 (LWP 4057)):
#0 0x2b52ef96 in ?? () from /lib/arm-linux-gnueabihf/libc.so.6
#1 0x2b568634 in _IO_file_close () from /lib/arm-linux-gnueabihf/libc.so.6
#2 0x2b568ffe in _IO_file_close_it () from /lib/arm-linux-gnueabihf/libc.so.6
#3 0x2b56113a in fclose () from /lib/arm-linux-gnueabihf/libc.so.6
#4 0x2bea8d00 in udev_new () from /lib/arm-linux-gnueabihf/libudev.so.0
#5 0x2be7d2e4 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
#6 0x2be7d2e4 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 1 (Thread 0x2bcb9220 (LWP 4056)):
#0 0x2bab6f48 in gcoHAL_QueryChipCount () from /usr/lib/libGAL.so
#1 0x2ba7ccbc in veglGetThreadData () from /usr/lib/libEGL.so.1
#2 0x2ba74cd0 in eglBindAPI () from /usr/lib/libEGL.so.1
#3 0x2be41934 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
#4 0x2be41934 in ?? () from /usr/local/Qt-Debian/plugins/platforms/libqeglfs.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) quit
How could I possibly fix that error?

I have the exact same crash on a MarSBoard, running an EGL framebuffer (fb) application on a Yocto image built with recipes from https://github.com/silmerusse/meta-robomind.
I had to copy the EGL/OpenGL libraries from http://repository.timesys.com/buildsources/g/gpu-viv-bin-mx6q/.
In my case galcore.ko is built into the kernel.
Edit:
Check that you have /dev/galcore and that its permissions are crw-rw-rw- (otherwise run sudo chmod 666 /dev/galcore).
If you don't have /dev/galcore, try insmod /lib/modules/..../kernel/drivers/mxc/gpu-viv/galcore.ko.
These steps fixed the crash for me on an Ubuntu image.
On the Yocto image the galcore driver is built in and seems to be present, but I still get the crash.
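The device-node check described above can be sketched in a few lines of Python. This is only an illustration: the function name and the status strings are made up here, and the sketch assumes that "usable" means a character device that is readable and writable by everyone (mode 666), as suggested in this answer.

```python
import os
import stat

def check_device_node(path="/dev/galcore"):
    """Return a short status string for a device node.

    'missing'     -> nothing exists at the path (driver probably not loaded)
    'not-chardev' -> something exists, but it is not a character device
    'no-access'   -> character device, but not rw for user, group and others
    'ok'          -> character device with at least mode 666
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISCHR(st.st_mode):
        return "not-chardev"
    # S_IMODE strips the file-type bits and leaves only the permission bits.
    if stat.S_IMODE(st.st_mode) & 0o666 != 0o666:
        return "no-access"
    return "ok"
```

Calling check_device_node() on the board before starting the Qt application shows which of the two fixes (insmod or chmod) is worth trying first.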
Edit:
The crash on the Yocto image was caused by the wrong version of the EGL/GAL.so libs. Apparently the galcore driver built into the kernel has version 4.6.9.6622, which requires the libs from gpu-viv-bin-mx6q-3.0.35-4.1.0. After manually copying those libs into /usr/lib, my fb application runs fine, using hardware OpenGL ES 2.0 and hardware decoding of H.264 video.
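A quick way to spot this kind of mismatch is to compare the two "HAL ... version" lines that the Vivante libraries print at startup. The sketch below parses the lines exactly as they appear in the question's output (user build 6622, kernel build 1210). The helper names are hypothetical, and treating any difference in version or build as a mismatch is an assumption drawn from this answer, not a documented rule.

```python
import re

# Matches e.g. "HAL user version 4.6.9 build 6622"
HAL_LINE = re.compile(r"HAL (user|kernel) version (\d+(?:\.\d+)*) build (\d+)")

def hal_versions(log_text):
    """Map 'user' / 'kernel' to the (version, build) pair found in the log."""
    return {side: (version, int(build))
            for side, version, build in HAL_LINE.findall(log_text)}

def hal_mismatch(log_text):
    """True when both lines are present and they disagree."""
    found = hal_versions(log_text)
    return ("user" in found and "kernel" in found
            and found["user"] != found["kernel"])

# The two lines from the question's output.
log = """[ 1] HAL user version 4.6.9 build 6622 Aug 15 2013 13:22:40
[ 2] HAL kernel version 4.6.9 build 1210"""

print(hal_versions(log))  # {'user': ('4.6.9', 6622), 'kernel': ('4.6.9', 1210)}
print(hal_mismatch(log))  # True
```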

I fixed this issue by switching to Yocto, thereby gaining access to the most recent releases of essential components.
If you're developing for an i.MX CPU, I strongly recommend having a look at https://github.com/Freescale/fsl-community-bsp-platform
It is very important to remove "x11" from default-distrovars and "wayland" from poky.conf as these will cause you to run into errors.
Building Qt5 on such a setup works fine.

I got a similar segfault when I forgot to load the galcore module. Here is the backtrace:
#0 0x766062b0 in gcoHAL_QueryChipCount (Hal=Hal@entry=0x0, Count=Count@entry=0x16494)
at gc_hal_user_query.c:1726
#1 0x766da244 in veglGetThreadData () at gc_egl.c:137
#2 0x766d3210 in eglfGetDisplay (display_id=0x16c08) at gc_egl_init.c:464
#3 eglGetDisplay (DisplayID=0x16c08) at gc_egl_init.c:565
Qt 5.3.2, kernel 3.10.17, Galcore version 4.6.9.9754
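Whether the module is loaded can be checked by looking for it in /proc/modules, where the first field of each line is the module name. A minimal sketch follows; the helper names and the sample text in the usage lines are invented for illustration. Note that a driver built into the kernel, as in the Yocto case described in another answer, does not appear in /proc/modules, so a negative result is only meaningful for a modular galcore.

```python
def module_loaded(name, modules_text):
    """True if `name` is the first field of any line in /proc/modules-style text."""
    return any(line.split()[0] == name
               for line in modules_text.splitlines() if line.strip())

def read_proc_modules(path="/proc/modules"):
    """Return the raw module list; only exists on Linux."""
    with open(path) as f:
        return f.read()

# Made-up sample in the /proc/modules layout, for illustration only.
sample = "galcore 212992 0 - Live 0x7f000000\nsnd 65536 1 - Live 0x7f100000\n"
print(module_loaded("galcore", sample))  # True
print(module_loaded("galcore", "snd 65536 1 - Live 0x7f100000\n"))  # False
```

On the board itself the call would be module_loaded("galcore", read_proc_modules()).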

Related

Recovering line of code that caused kernel panic

I'm running CentOS 8.1 and my machine has kernel panicked. I installed the kernel-debuginfo package and I am generally following the steps in Section 7.11: Analyzing a core dump.
Here is an abbreviated version of my debugging session:
# crash /usr/lib/debug/usr/lib/modules/4.18.0-147.el8.x86_64/vmlinux /var/crash/XXX/vmcore
.
.
.
WARNING: kernel relocated [336MB]: patching 93296 gdb minimal_symbol values
KERNEL: /usr/lib/debug/usr/lib/modules/4.18.0-147.el8.x86_64/vmlinux
DUMPFILE: /var/crash/XXX/vmcore [PARTIAL DUMP]
CPUS: 48
DATE: Sun Jan 10 13:36:04 2021
UPTIME: 23 days, 22:18:40
LOAD AVERAGE: 10.00, 10.01, 10.00
TASKS: 1966
NODENAME: YYY
RELEASE: 4.18.0-147.el8.x86_64
VERSION: #1 SMP Wed Dec 4 21:51:45 UTC 2019
MACHINE: x86_64 (2794 Mhz)
MEMORY: 2035.9 GB
PANIC: "Kernel panic - not syncing: Hard LOCKUP"
PID: 27666
COMMAND: "R"
TASK: ffff8ff017978000 [THREAD_INFO: ffff8ff017978000]
CPU: 2
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 27666 TASK: ffff8ff017978000 CPU: 2 COMMAND: "R"
#0 [ffffb5187c6c7a50] machine_kexec at ffffffff96057c4e
#1 [ffffb5187c6c7aa8] __crash_kexec at ffffffff96155b8d
#2 [ffffb5187c6c7b70] panic at ffffffff960b0578
#3 [ffffb5187c6c7bf8] watchdog_overflow_callback.cold.8 at ffffffff9618bb11
#4 [ffffb5187c6c7c08] __perf_event_overflow at ffffffff961f54f2
#5 [ffffb5187c6c7c38] x86_pmu_handle_irq at ffffffff96007a16
#6 [ffffb5187c6c7e88] amd_pmu_handle_irq at ffffffff96008b14
#7 [ffffb5187c6c7ea0] perf_event_nmi_handler at ffffffff960060cd
#8 [ffffb5187c6c7eb8] nmi_handle at ffffffff96021843
#9 [ffffb5187c6c7f10] default_do_nmi at ffffffff96021cce
#10 [ffffb5187c6c7f30] do_nmi at ffffffff96021ea8
#11 [ffffb5187c6c7f50] nmi at ffffffff96a01537
RIP: 0000146816a69d6e RSP: 00007ffc378f6270 RFLAGS: 00000216
RAX: 000000006655e8df RBX: 0000000000000003 RCX: 000000000003ad90
RDX: 0000000000000003 RSI: 0000000007ccb060 RDI: 000000076fe90140
RBP: 0000000000085fc6 R8: 0000000000b13039 R9: 00000007770c0758
R10: 0000000778ba8a38 R11: 0000000000000c36 R12: 000000000070e070
R13: 0000000000000000 R14: 00007ffc378f6410 R15: 00007ffc378f6430
ORIG_RAX: ffffffffffffffff CS: 0033 SS: 002b
Clearly the culprit is an R process which I was running (see COMMAND: "R"). Looking at the bt above, it seems to return only the kernel-level functions. I want to know which line in my R code (or in the installed R libraries) is causing the issue. Trying
crash> gdb bt
No stack.
gdb: gdb request failed: bt
crash>
is not useful. Looking at man crash it isn't directly obvious how to do that. Some of the R libraries possibly involved have C++ code that was compiled with debug flags. I have a vague hope of the cause of the error being in one of these.
QUESTION:
How do I use the Linux crash utility to recover the line of code (either R or C++) that caused the kernel to panic?
What is a "Kernel panic - not syncing: Hard LOCKUP"?

Video play on arm-Preemption disabled at <>

I am currently using an ARM Freescale processor board to perform video playback.
When I use the Qt Multimedia libraries in the application, I get a "Preemption disabled" message and then the video playback freezes:
root@sanuser:~# /home/app.sh start
====== AIUR: 4.1.4 build on Mar 2 2018 10:49:12. ======
Core: MPEG4PARSER_06.09.36 build on Aug 23 2016 05:18:47
file: /usr/lib/imx-mm/parser/lib_mp4_parser_arm11_elinux.so.3.2
------------------------
Track 00 [video_0] Enabled
Duration: 0:02:07.280000000
Language: und
Mime:
video/x-h264, parsed=(boolean)true, alignment=(string)au, stream-format=(string)avc, width=(int)640, height=(int)480, framerate=(fraction)25/1, codec_data=(buffer)014d401effe10018674d401eda0280f6c044000003000400000300c83c58ba8001000468ef3c80
------------------------
------------------------
Track 01 [audio_0] Enabled
Duration: 0:02:07.317000000
Language: und
Mime:
audio/mpeg, mpegversion=(int)4, channels=(int)6, rate=(int)48000, bitrate=(int)384039, stream-format=(string)raw, codec_data=(buffer)11b0
------------------------
====== BEEP: 4.1.4 build on Mar 2 2018 10:49:24. ======
Core: AAC decoder Wrapper build on May 30 2016 12:33:44
file: /usr/lib/imx-mm/audio-codec/wrap/lib_aacd_wrap_arm12_elinux.so.3
CODEC: BLN_MAD-MMCODECS_AACD_ARM_03.09.00_CORTEX-A8 build on Jul 13 2016 18:15:25.
[INFO] bitstreamMode 1, chromaInterleave 0, mapType 0, tiled2LinearEnable 0
BUG: scheduling while atomic: vqueue:src/642/0x00000101
Preemption disabled at:[<80101544>] __do_softirq+0x5c/0x3b4
However, video playback is smooth, without any issues, when using a GStreamer pipeline:
gst-launch-1.0 -v filesrc location=/home/SampleVideo_720x480.mp4 typefind=true ! qtdemux ! queue max-size-time=0 ! vpudec ! videoconvert ! imxipuvideosink framebuffer=/dev/fb0
The Qt Multimedia stack happens to trigger a kernel bug. There may be a bug in Qt too, but no userspace application should be able to trigger a kernel BUG, and that is what "BUG: scheduling while atomic" is: kernel code tried to sleep or reschedule while in atomic context. It's a driver bug somewhere in the kernel code, possibly in the video driver for the board. GStreamer probably either has a workaround for that kernel bug or does something differently so that the bug isn't hit. But it's definitely a kernel bug, no doubt about it.

Are there any limitations to starting multiple grpc::ServerBuilders in an application?

My C++ application running on Ubuntu 14.04 is having problems. I am using grpc to communicate with a Go webserver application which serves webpages showing the status/configuration of the C++ application.
I had been using a year-old version of grpc (0.14-something), so before posting here I upgraded everything (grpc 1.3.1, Go 1.8.1).
My C++ application still crashes quite often with grpc 1.3.1 (and with 1.0.0, 1.2.5, 1.2.0, etc.).
I am getting a SIGABRT with a double free warning. The application runs for a while, but after a period in which the web application is requesting data from the C++ application, it crashes. gdb output:
[New LWP 9908]
[New LWP 9881]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./bhio'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 _int_malloc (av=0x7fb3ac000020, bytes=16) at malloc.c:3351
3351 malloc.c: No such file or directory.
(gdb) where
#0 _int_malloc (av=0x7fb3ac000020, bytes=16) at malloc.c:3351
#1 0x00007fb4205db6c0 in __GI___libc_malloc (bytes=16) at malloc.c:2891
#2 0x000000000076bb2f in gpr_malloc ()
#3 0x000000000077678d in grpc_error_create ()
#4 0x000000000078ba94 in ?? ()
#5 0x000000000078dbee in grpc_chttp2_fail_pending_writes ()
#6 0x000000000078e19f in grpc_chttp2_mark_stream_closed ()
#7 0x000000000078e2eb in grpc_chttp2_cancel_stream ()
#8 0x000000000078ef1c in ?? ()
#9 0x000000000077597e in grpc_combiner_continue_exec_ctx ()
#10 0x0000000000777678 in grpc_exec_ctx_flush ()
#11 0x000000000078095f in grpc_call_cancel_with_status ()
#12 0x0000000000780be1 in grpc_call_destroy ()
#13 0x0000000000769bd7 in grpc::ServerContext::~ServerContext() ()
#14 0x0000000000768c7c in grpc::Server::SyncRequest::CallData::~CallData() ()
#15 0x00000000007691e3 in grpc::Server::SyncRequestThreadManager::DoWork(void*, bool) ()
#16 0x000000000076aff1 in grpc::ThreadManager::MainWorkLoop() ()
#17 0x000000000076b04c in grpc::ThreadManager::WorkerThread::Run() ()
#18 0x00007fb420eebbf0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#19 0x00007fb421146184 in start_thread (arg=0x7fb3ca686700)
at pthread_create.c:312
#20 0x00007fb42065337d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)
(gdb) quit
or here:
[Thread 0x7fff7d7fa700 (LWP 3521) exited]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff7e7fc700 (LWP 3524)]
__GI___libc_free (mem=0xb5) at malloc.c:2929
2929 malloc.c: No such file or directory.
(gdb)
(gdb) where
#0 __GI___libc_free (mem=0xb5) at malloc.c:2929
#1 0x000000000077b7b5 in grpc_byte_buffer_destroy ()
#2 0x0000000000773ac3 in grpc::Server::SyncRequest::CallData::~CallData() ()
#3 0x000000000077405a in grpc::Server::SyncRequestThreadManager::DoWork(void*, bool) ()
#4 0x0000000000776111 in grpc::ThreadManager::MainWorkLoop() ()
#5 0x000000000077616c in grpc::ThreadManager::WorkerThread::Run() ()
#6 0x00007ffff6c9fbf0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007ffff6efa184 in start_thread (arg=0x7fff7e7fc700)
at pthread_create.c:312
#8 0x00007ffff640737d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
So, if it is OK to start multiple ServerBuilders, what else could the above errors point to that I might be doing wrong in using the grpc library? I am taking this code over from someone else who wrote it, so my knowledge of grpc is lacking. I don't think grpc is that unstable, so it must be something I am doing incorrectly.
Any ideas would be helpful.
Any suggestions for debugging it better would be helpful.
To build grpc, I am just doing the following:
$ git clone -b $(curl -L http://grpc.io/release) https://github.com/grpc/grpc
$ cd grpc
$ git submodule update --init
$ make
$ [sudo] make install
Are there options to compile differently that might provide more information?
Thanks in advance for the help/suggestions.
Bob
It turns out that a small resource leak (a socket file descriptor that was never closed) in the registered service function was causing the issue.
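The original service function is C++ and is not shown, so the following Python sketch only illustrates the general pattern of a handler that leaks a socket descriptor on every call, next to a version that always closes it. Both function names are hypothetical. In C++ the equivalent fix would be a close() on every exit path or an RAII wrapper around the descriptor.

```python
import socket

def fetch_status_leaky():
    """Bug pattern: a socket is opened on every call and never closed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # ... talk to some backend here ...
    return sock  # returned only so the caller can inspect its state

def fetch_status_closed():
    """Same work, but the with-block closes the descriptor on every exit path."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        pass  # ... talk to some backend here ...
    return sock  # already closed at this point

leaked = fetch_status_leaky()
closed = fetch_status_closed()
print(leaked.fileno() >= 0)   # True: the descriptor is still open
print(closed.fileno() == -1)  # True: the descriptor was released
leaked.close()
```

Each leaked descriptor counts against the per-process limit on open files, which fits the report that the application ran for a while and only failed after many requests.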

Docker Wordpress CPU jumps when selecting featured image

I have a DO VPS with 1GB RAM and 1 CPU.
I started to see issues when I selected a Featured Image: I lost the database connection. I checked both the docker stats and the DO CPU stats. Shortly after I selected an image of around 1 MB, my wp container's CPU usage skyrocketed, and it either caused the memory to run out or just did not do anything.
Then I got the following when I tried to run docker ps -a:
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f55471d3428 m=0
goroutine 0 [idle]:
goroutine 1 [running]:
runtime.systemstack_switch()
/usr/local/go/src/runtime/asm_amd64.s:245 fp=0xc820020770 sp=0xc820020768
runtime.main()
/usr/local/go/src/runtime/proc.go:126 +0x62 fp=0xc8200207c0 sp=0xc820020770
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc8200207c8 sp=0xc8200207c0
goroutine 17 [syscall, locked to thread]:
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1998 +0x1
rax 0x0
rbx 0x7f5547562700
rcx 0x7f55471d3428
rdx 0x6
rdi 0x17a
rsi 0x17a
rbp 0xea3bde
rsp 0x7ffdf52e7938
r8 0x7f5547563770
r9 0x7f5547ba8700
r10 0x8
r11 0x202
r12 0x2cfb050
r13 0xe6f464
r14 0x0
r15 0x8
rip 0x7f55471d3428
rflags 0x202
cs 0x33
fs 0x0
gs 0x0
Has anyone else seen something like this?
I have done a LoadImpact test which ran for 5 minutes without any issues.
Any advice on how to troubleshoot this?

gdb can't run programs linked against libcrypto

On Raspbian, I can't run a program linked with libcrypto from gdb. It doesn't matter what the program contains. Example:
(gdb) r
Starting program: /home/pi/test/test
Program received signal SIGILL, Illegal instruction.
0x400844c0 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) signal SIGILL
Continuing with signal SIGILL.
Cannot access memory at address 0x0
Program received signal SIGILL, Illegal instruction.
0x400844c8 in ?? () from /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.0.0
(gdb) signal SIGILL
Continuing with signal SIGILL.
Cannot access memory at address 0x0
[Inferior 1 (process 1566) exited normally]
(gdb)