Linux: Segfault when binary has certain filename

Derron(Posted 2016) [#1]
Howdy,

it's me again - this time with an odd scenario.

While including some of Brucey's magic performance tricks into my game and framework, I ran tests with my "demoapp.bmx" test suite for GUI widgets.
Suddenly things went strange, as it segfaulted ~50% of the time.

I truncated code down to:


And it still segfaulted from time to time.

Removing everything except printing "OK" ... ran without trouble.

Running the above thing via GDB led to a backtrace of 1 entry:
Program received signal SIGSEGV, Segmentation fault.
0xf612a23d in ?? () from /usr/lib/i386-linux-gnu/dri/fglrx_dri.so

So it seems the driver is complaining about something.
(Actually it crashes at the "Graphics" call, as this tries to initialize the graphics context.)

I wondered why my game never crashed at this point - nor does it when I compile it now (using the same code).

I then restarted the computer - and oddly again only that "demoapp" was crashing.

I played with it, checking imports and so on ...
Then I copied the file to another directory: it still segfaulted.

Ok ... let's do some magic: I renamed(!) it to "testme" and ran that a bunch of times without a segfault. I then renamed it back to "demoapp" and voila ... segfault.


summary:
- source code above "segfaults" with 50% chance when binary is named "demoapp"
- copying binary to somewhere else still leads to segfault
- renaming binary (copied or original) to "testme" made it run flawlessly
- rebooting computer (power off, 10s wait, power on) did not change anything
- copying the "demoapp" binary to _another_ hdd (from SSD to SATA) still segfaults
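
The rename experiment from the summary could be automated along these lines. This is only a minimal sketch: the helper names `count_crashes` and `compare_names`, the number of runs and the timeout are assumptions and not something from the thread. It copies a binary under different names, starts each copy repeatedly and counts how often the process is killed by SIGSEGV. A run that is still alive after the timeout is treated as a successful start.

```python
import shutil
import signal
import subprocess
import tempfile
from pathlib import Path


def count_crashes(command, runs=20, timeout=3.0):
    """Run `command` `runs` times and count exits caused by SIGSEGV.

    A run that is still alive after `timeout` seconds is treated as a
    successful start and is killed.
    """
    crashes = 0
    for _ in range(runs):
        proc = subprocess.Popen(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        try:
            returncode = proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
            continue
        # On POSIX a negative return code means "killed by that signal".
        if returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


def compare_names(binary, names, runs=20, timeout=3.0):
    """Copy `binary` under each name in `names` and count crashes per name."""
    results = {}
    with tempfile.TemporaryDirectory() as workdir:
        for name in names:
            target = Path(workdir) / name
            shutil.copy2(binary, target)  # copy2 keeps the executable bit
            results[name] = count_crashes([str(target)], runs, timeout)
    return results
```

Something like `compare_names("./demoapp", ["demoapp", "testme"])` would then return one crash count per file name, which makes the "only this name crashes" observation easier to confirm or refute than counting by hand.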



Could one of you explain what might be going on here?
(Attention: I _copied_ not _moved_ the file, so although linux file systems transparently allow moving even while writing to them, this means a "cache" cannot have been hit)


bye
Ron


dawlane(Posted 2016) [#2]
Using Vanilla or NG? If NG how long ago was the update? Threads used?
Show hidden files in the file manager and clear out any .bmx folder in the build directory.

You may need to install a few library debug files (X server stuff, glibc, libstdc++) and sources to pin it down.


Derron(Posted 2016) [#3]
Using vanilla (with pretty "fresh" brl.mod and pub.mod from maxmods - but using framework, so it should only concern brl.blitz and so on).


Like I said, it also happens when copying the "demoapp.bmx" file to another directory (or HDD) and compiling it there.


just tried again:

Opened up a new document in a new folder on another hard disc.

Pasted
SuperStrict

'keep it small
Framework BRL.standardIO
Import brl.Graphics
Import BRL.GLMax2D

SetGraphicsDriver GLMax2DDriver()
Graphics(800, 600, 0, 0, 0)
HideMouse()

Repeat
    Cls
    DrawOval(MouseX()-2, MouseY()-2, 4, 4)
    Flip
Until AppTerminate() Or KeyHit(KEY_ESCAPE)

saved it as "demoapp.bmx"
(This is the important thing - even renaming it to "demoapp2.bmx" gets rid of the problem - so "somehow" it must have something to do with caches or so).


compiled it ... segfault ... compiled again ... run ... ... then run...then segfault ..


So how to continue with the "library debug files" ?


bye
ron


Derron(Posted 2016) [#4]
OK ... I just opened up MaxIDE for "NG":

pasted the same content, saved it in
BlitzMaxNG/test/demoapp.bmx

Compiled as 64bit ... run ... run ... run ... run

Compiled as 32bit ... segfault ... run .... segfault.
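
To confirm which of the two builds is actually being started, the ELF class can be read from the file header. A minimal sketch (the function name `elf_class` is made up for illustration): byte 4 of the ELF identification, EI_CLASS, is 1 for 32-bit and 2 for 64-bit objects.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if `path` is not ELF."""
    with open(path, "rb") as handle:
        ident = handle.read(5)
    # Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        return None
    # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(ident[4])
```

Calling `elf_class("./demoapp")` on both builds should give 64 for the one that runs and 32 for the one that segfaults, if the compiler settings did what was intended.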



bye
Ron


dawlane(Posted 2016) [#5]
I wonder if there is a corrupt i386 system library? If possible, move the application into a VM or into a clean Linux install. If it works then there is an issue with your system setup. Then try to build in the VM and transfer it over to your system.

So how to continue with the "library debug files" ?
Find the library debug symbol packages in the package manager. Their names usually end with -dbg, e.g. libgtk-3-0-dbg.

To install the source code for use on Debian/Ubuntu based systems, open a terminal and make a directory named /build in the system root directory. Change into this directory, then use
sudo apt-get source the_source_code_package_name
You can find the package names from the Ubuntu Packages Search. One thing to note is that the debug symbols may complain about not being able to find such-and-such source file. Unfortunately the source packages don't include any additional file tree hierarchies.

e.g. All SDL2 source package files have to go into a directory tree of /build/libsdl2-olgtWF/


Derron(Posted 2016) [#6]
As the issue is in "fglrx" (the segfault) I am not sure whether running a VM will help.

Also I am not sure whether AMD provides DBG-packages for their proprietary driver.


The demoapp ran fine on a Linux Mint 17 32-bit install on a board with an Intel IGP (while mine is a 64-bit one running an AMD IGP - Llano chipset, so I need to run "fglrx" instead of "radeon").


bye
Ron


dawlane(Posted 2016) [#7]
I am not sure whether running a VM will help.
If you can set your system up as a multiboot and install a 32-bit version alongside your normal install, then you will know whether it's a problem with the i386 side of the 64-bit installation. I've got my system set up with a number of OSes on an LVM. Each home directory's internal file hierarchy has a number of links to another drive that is shared. Each distribution's user configuration files stay in that distribution's home directory, while folders such as Desktop, Documents, etc. are linked to the shared drive's directories of the same name. All it takes is a script to do the dirty work each time.

Also I am not sure whether AMD provides DBG-packages for their proprietary driver.
They won't. But those other debug and source files can be a big help.


Derron(Posted 2016) [#8]
What if it has to do with a certain setup "mix" - versions of libs, GPU driver ...

So even a fresh 64bit install might "work" then (mine is only some weeks old, reinstalled when buying my first SSD *wohoo* - replacing a years old "upgraded" installation).


Dunno if it is really worth the hassle of narrowing down the bug if the resolution is then something like "updating the gpu driver as they fixed it there already".

Just hoped for some nifty "sudden inspiration".


Nonetheless I will try to grab some time tomorrow to boot up from a USB stick.


bye
Ron


Derron(Posted 2016) [#9]
Hmm, for one or two days I have had another problem running my compiled files:

This time it happens if I call my binary "TVTower" - so this is different from "demoapp", which _might_ have been something hardcoded somewhere.

Now it does not segfault, but fails to create the graphics context (-> opening the GL window), so the application just "hangs".

Renaming the binary leads to a trouble-free run on each execution.

(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0xdf63168b: 
#0  0xf52be247 in ?? () from /usr/lib/i386-linux-gnu/dri/fglrx_dri.so
Cannot access memory at address 0xdf63168b


Really seems to have issues with the 32bit drivers/libs.


Just wondering why this happens even though nothing related was changed on the system during the last few days (no updates except for libbind/libdns etc.).
The system has been restarted daily for a while now (= shut down at night).


This happens when enabling a bit more verbosity:

when working:
$ LIBGL_DEBUG=verbose ./TVTower
[14:33:27] INFO     | CORE: Starting TVTower, v0.3.5.1 Build "15.11.16 14:32".
[...]
[14:33:27] DBG      | GRAPHICSMANAGER.INITGRAPHICS(): Initializing graphics.
[14:33:27] DBG      |                               : SetGraphicsDriver "OpenGL".
libGL: AtiGetClientDriverName: 15.30.3 fglrx (screen 0)
libGL: OpenDriver: trying /usr/lib/i386-linux-gnu/dri/fglrx_dri.so
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 6, (OK)
ukiGetBusid returned 'PCI:0:1:0'
ukiOpenDevice: node name is /dev/ati/card1
ukiOpenDevice: UKI_ERR_NOT_ROOT
[...]
ukiOpenDevice: node name is /dev/ati/card15
ukiOpenDevice: UKI_ERR_NOT_ROOT
ukiDynamicMajor: found major device number 249
ukiOpenByBusid: Searching for BusID PCI:0:1:0
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 6, (OK)
ukiOpenByBusid: ukiOpenMinor returns 6
ukiOpenByBusid: ukiGetBusid reports PCI:0:1:0
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiOpenByBusid: Searching for BusID PCI:0:1:0
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 8, (OK)
ukiOpenByBusid: ukiOpenMinor returns 8
ukiOpenByBusid: ukiGetBusid reports PCI:0:1:0
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 9, (OK)
ukiGetBusid returned 'PCI:0:1:0'
ukiOpenDevice: node name is /dev/ati/card1
ukiOpenDevice: UKI_ERR_NOT_ROOT
[...]
ukiOpenDevice: node name is /dev/ati/card15
ukiOpenDevice: UKI_ERR_NOT_ROOT
ukiDynamicMajor: found major device number 249
ukiOpenByBusid: Searching for BusID PCI:0:1:0
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 9, (OK)
ukiOpenByBusid: ukiOpenMinor returns 9
ukiOpenByBusid: ukiGetBusid reports PCI:0:1:0
[14:33:28] DBG      |                               : Initialized graphics with "OpenGL".
[14:33:28] DBG      |                               : Initialized virtual graphics (for optional letterboxes).
[...]


and when failing:
$ LIBGL_DEBUG=verbose ./TVTower
[14:33:34] INFO     | CORE: Starting TVTower, v0.3.5.1 Build "15.11.16 14:32".
[...]
[14:33:34] DBG      | GRAPHICSMANAGER.INITGRAPHICS(): Initializing graphics.
[14:33:34] DBG      |                               : SetGraphicsDriver "OpenGL".
libGL: AtiGetClientDriverName: 15.30.3 fglrx (screen 0)
libGL: OpenDriver: trying /usr/lib/i386-linux-gnu/dri/fglrx_dri.so
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 6, (OK)
ukiGetBusid returned 'PCI:0:1:0'
ukiOpenDevice: node name is /dev/ati/card1
ukiOpenDevice: UKI_ERR_NOT_ROOT
[...]
ukiOpenDevice: node name is /dev/ati/card15
ukiOpenDevice: UKI_ERR_NOT_ROOT
ukiDynamicMajor: found major device number 249
ukiOpenByBusid: Searching for BusID PCI:0:1:0
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 6, (OK)
ukiOpenByBusid: ukiOpenMinor returns 6
ukiOpenByBusid: ukiGetBusid reports PCI:0:1:0
ukiDynamicMajor: found major device number 249
ukiDynamicMajor: found major device number 249
ukiOpenByBusid: Searching for BusID PCI:0:1:0
ukiOpenDevice: node name is /dev/ati/card0
ukiOpenDevice: open result is 8, (OK)
ukiOpenByBusid: ukiOpenMinor returns 8
ukiOpenByBusid: ukiGetBusid reports PCI:0:1:0
^C
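
The two logs are identical up to the point where the failing run stops. A small sketch of how that point could be located automatically, assuming both outputs were captured as text; the helper names and the timestamp-stripping pattern are illustrative assumptions:

```python
import re

# Matches the application's own "[hh:mm:ss] " prefix so that runs started
# at different times still compare equal.
_TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s*")


def normalize(log_text):
    """Strip timestamps and trailing whitespace from every log line."""
    return [_TIMESTAMP.sub("", line.rstrip()) for line in log_text.splitlines()]


def first_divergence(good_log, bad_log):
    """Return (index, good_line, bad_line) at the first differing line.

    If one log is a prefix of the other, the missing side is None.
    Returns None when both logs are identical after normalization.
    """
    good, bad = normalize(good_log), normalize(bad_log)
    for index in range(max(len(good), len(bad))):
        good_line = good[index] if index < len(good) else None
        bad_line = bad[index] if index < len(bad) else None
        if good_line != bad_line:
            return index, good_line, bad_line
    return None
```

For the output above, the failing run stops right after the second ukiOpenByBusid sequence (open result 8), before the device open that returns 9 in the working run.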


bye
ron


Derron(Posted 2016) [#10]
Followed these instructions:

http://askubuntu.com/questions/332526/error-while-loading-shared-libraries-libgl-so-1-wrong-elf-class-elfclass32#333070

(no, I did not have THIS error, but found it while searching for mine)

and renamed /usr/lib64 to /usr/lib64_bak.
Removed the whole fglrx driver and reinstalled the newest one.

Without that step, fglrx first complained that it could not find "libgl.so.1.2" - and once I re-symlinked it, it failed with "cannot open display" (DRI and so on).


Now I can start my binary "TVTower" again without a segfault every second run.


Still do not know why it began happening 2 days ago (as a manual fglrx install is something I would have remembered).

Maybe it is "ldconfig" which updated some paths ?
I only downloaded monkey2 that day (or so) and did a "rebuildall"
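
Whether `ldconfig` now resolves a different libGL could be checked from its cache listing. A hedged sketch that parses the output of `ldconfig -p`, assuming `ldconfig` is on the PATH; the function names are made up for illustration and nothing here was run on the affected machine:

```python
import re
import subprocess

# One cache entry looks like:
#   "\tlibGL.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libGL.so.1"
_ENTRY = re.compile(r"^\s*(\S+) \(([^)]*)\) => (\S+)$")


def find_library(listing, prefix="libGL.so"):
    """Return (name, flags, path) tuples for cache entries starting with `prefix`."""
    matches = []
    for line in listing.splitlines():
        entry = _ENTRY.match(line)
        if entry and entry.group(1).startswith(prefix):
            matches.append(entry.groups())
    return matches


def current_cache_listing():
    """Return the text printed by `ldconfig -p` on this machine."""
    return subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=True
    ).stdout
```

`find_library(current_cache_listing())` lists every libGL the dynamic linker knows about, together with its architecture flags and path. Comparing that list before and after a driver reinstall would show whether the 32-bit entry changed.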


bye
Ron