Segfault on ArchLinux/Manjaro but not Mint/Linux

Derron(Posted 2016) [#2]
I got reports from some of my players (or rather, would-be players): they get crashes partway through loading my game.

I have set up a simple VM with "Manjaro Linux" (like ArchLinux, it uses "pacman" etc.).

With this VM running, I was able to do a GDB run of my debug build (in MaxIDE it just ended: no segfault report, nothing, it just aborted execution):

#0  0xf33d097a in llvm::DataLayout::setPointerAlignment(unsigned int, unsigned int, unsigned int, unsigned int) () from /usr/lib32/libLLVM-3.8.so
#1  0xf33d1ce9 in llvm::DataLayout::reset(llvm::StringRef) ()
   from /usr/lib32/libLLVM-3.8.so
#2  0xf34dcaa0 in llvm::Module::Module(llvm::StringRef, llvm::LLVMContext&) ()
   from /usr/lib32/libLLVM-3.8.so
#3  0xf33be48a in LLVMModuleCreateWithNameInContext ()
   from /usr/lib32/libLLVM-3.8.so
#4  0xf5f80447 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#5  0xf5f8061f in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#6  0xf61d8364 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#7  0xf61d0db1 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#8  0xf61bb714 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#9  0xf5d6855c in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#10 0xf5d2fd52 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#11 0xf5d12d1b in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#12 0xf5d1334d in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#13 0xf5d1a9f9 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#14 0xf5be8fcf in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#15 0x0851d5f1 in _brl_glmax2d_TGLMax2DDriver_SetColor ()
#16 0x0852154d in brl_max2d_SetColor ()
#17 0x08286c36 in _bb_TColor_setRGBA ()


As you can see, my code calls Max2D to set a new color ... and then "swrast_dri.so" fails somehow.

I assume that it has to do with the OpenGL implementation.
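
A long backtrace like this is easier to judge once the frames are grouped by the library they belong to. The following is only a minimal Python sketch: the function name and the "(main executable)" label are made up for illustration, and it assumes the usual GDB layout where a frame may wrap onto a second line starting with "from".

```python
import re
from collections import Counter


def libraries_in_backtrace(text):
    """Count GDB backtrace frames per shared library.

    Frames without a trailing "from <path>" are attributed to the
    main executable (hypothetical label, chosen for this sketch).
    """
    frames = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A new frame begins with "#<number>".
            frames.append(line)
        elif frames:
            # Wrapped continuation line, e.g. "from /usr/lib32/...".
            frames[-1] += " " + line

    counts = Counter()
    for frame in frames:
        match = re.search(r" from (\S+)$", frame)
        counts[match.group(1) if match else "(main executable)"] += 1
    return counts


SAMPLE = """\
#1  0xf33d1ce9 in llvm::DataLayout::reset(llvm::StringRef) ()
   from /usr/lib32/libLLVM-3.8.so
#4  0xf5f80447 in ?? () from /usr/lib32/xorg/modules/dri/swrast_dri.so
#15 0x0851d5f1 in _brl_glmax2d_TGLMax2DDriver_SetColor ()
"""

for library, count in libraries_in_backtrace(SAMPLE).most_common():
    print(count, library)
```

Applied to the full trace, this kind of grouping shows that only the last three frames are BlitzMax code; everything above them is inside swrast_dri.so and libLLVM.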


Edit: As it does not fail when compiled with NG, the ASM code might be the culprit - or the modules?

Edit2: The same happens when trying to run "digesteroids". I wrapped some prints around the SetAlpha/SetColor calls... and at some point it just fails on one of them, as if it was not able to send the GL commands to their destination.


With a simple test program like this:
SuperStrict

Framework Brl.StandardIO
Import Brl.Graphics
Import Brl.GLMax2D

Graphics(800,600)

Repeat
	Cls
	DrawRect(100,100,100,100)
	Flip
Until KeyHit(KEY_SPACE) Or AppTerminate()

it works when DrawRect() is commented out. Otherwise it fails with:
#15 0xf7d8e3e2 in ?? () from /usr/lib32/libGL.so.1
#16 0xf7d690c4 in glXSwapBuffers () from /usr/lib32/libGL.so.1
#17 0x0804b8ad in _swapBuffers (context=<optimized out>)
    at /home/ronny/BlitzMax/mod/brl.mod/glgraphics.mod/glgraphics.linux.c


I also updated FASM to 1.71 and added "#define PNG_NO_MMX_CODE" to png.h in pub.mod/libpng.mod when trying digesteroids with a vanilla 1.50 download (and rebuilt the modules with the new fasm).


Ideas what to do now?

bye
Ron


RustyKristi(Posted 2016) [#3]
This is what I've been talking about. I would assume your players are mostly 64-bit Linux users, as the problem occurs mostly on 64-bit systems. At least that's what I've been seeing so far.


dawlane(Posted 2016) [#4]
First things first. Using a VM is unpredictable. Setting your system up to use Logical Volume Management would allow you to install a few distributions side-by-side, but there are some caveats with some distributions not playing well together.

Second:
swrast_dri.so is the software render driver library. You would normally see this if there were no hardware accelerated drivers available. It is possible on some distributions for this driver to get loaded even when you have installed proprietary drivers. This happens usually with bad configuration where the wrong version of libGL.so gets loaded. The command chpath should be able to rectify this when it happens.

Third:
There are issues with the assembly being generated. Some 64 bit versions of GCC don't like the MMX instructions and with BlitzMax you get an undefined reference error.
And with Debian 8, I can only get debug builds to work. Release builds just segfault with no indication of what went wrong. And frankly I just cannot be arsed to find out; I have much better things to do.

Fourth:
ArchLinux and its derivatives don't have all of the required libraries as binaries that you can install from the repositories. LibXPM4 springs to mind here.

As I have said in a few posts now, it's time to start looking at alternative tools if you wish to do cross-platform software that includes Linux. I cannot see anybody taking the time to fix all the problems that were already in BlitzMax when it was still on sale, let alone now that it's free.


Derron(Posted 2016) [#5]
@ Software rendering

As I said: when doing a build with BMX-NG, it runs fine on these systems (on the user's machines...and in my VM).
And this is what made me wonder.
But yes... the system reports that screen 0 does not support DRI2... but then, why does it work so nicely with NG? Maybe a module issue then (NG uses some adjusted modules)?

If NG failed too... I would have suspected the modules or my program code (if it failed on the users' computers too) or problems with the VM software (if it only failed here).

So I think it has to do with the ASM code (but this does not explain why it works on one Distro but not another).


@ libXpm(.so.4)
I had to activate AUR to be able to install it (so it compiled it from the sources). But I installed it already to be able to start MaxIDE (fltk variant).


@Arch
It uses a different glibc etc., so things compiled there won't run on my Mint setup:
./TVTower.debug: /usr/lib/i386-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by ./TVTower.debug)
./TVTower.debug: /usr/lib/i386-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./TVTower.debug
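
These messages come from the dynamic loader: the binary was linked against a newer libstdc++ and asks for symbol versions that the older library on the target system does not export. As a rough Python sketch (the function name is hypothetical, and the pattern only assumes the message layout shown above), the required versions can be pulled out of such output like this:

```python
import re

# Matches loader messages of the form:
#   <binary>: <library>: version `<VERSION>' not found (required by ...)
VERSION_ERR_RE = re.compile(
    r"^(?P<binary>\S+): (?P<library>\S+): version `(?P<version>[^']+)' not found"
)


def missing_versions(stderr_text):
    """Return (library, version) pairs for every 'version not found' line."""
    result = []
    for line in stderr_text.splitlines():
        match = VERSION_ERR_RE.match(line.strip())
        if match:
            result.append((match.group("library"), match.group("version")))
    return result
```

Building on the oldest distribution that has to be supported avoids this class of error, since newer libstdc++ releases keep exporting the older symbol versions.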

Might these things combine to produce this bug/problem with a defunct BlitzMax (1.50)?


Thanks for your reply.

Bye
Ron


dawlane(Posted 2016) [#6]
So I think it has to do with the ASM code (but this does not explain why it works on one Distro but not another).
That would come under "Third": it would be a bcc issue, and maybe also a matter of whatever is getting passed to GCC via bmk. As BlitzMax is not the easiest to debug at the lowest level, it would be hard to tell what is actually happening. There is always the possibility of subtle differences in how the libraries are built between distributions. Again, this would depend on what options were passed to the GCC tools. The default GCC options tend to differ slightly from version to version as well as from distribution to distribution.

But I would be curious as to why there are references to what obviously looks like the LLVM runtime?

It uses other glibc etc. So things compiled there won't run on my Mint-setup
ArchLinux, if I remember correctly, is a rolling release. It would have the newer libraries, but should still be able to run old software.
If you build something on ArchLinux, it will not work on older versions of Mint/Ubuntu/Debian, and then it may not work on any of the newer versions either.


Derron(Posted 2016) [#7]
@ LLVM
As far as I have read "on the internet" this is because it uses software rendering.

Another possible reason is:
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.8, 128 bits)


But when running "glxinfo[32]" it prints
direct rendering: Yes



Without "DRI2" support it uses the software renderer. Other ArchLinux users reported issues with "steam" and some proprietary drivers: they got segfaults then (because of that DRI2 issue).


Awaiting responses from the users who reported the issue first. If they also have problems there, this will explain things:

$ LIBGL_DEBUG=verbose glxinfo32 | grep OpenGL
libGL: screen 0 does not appear to be DRI2 capable
libGL: OpenDriver: trying /usr/lib32/xorg/modules/dri/tls/swrast_dri.so
libGL: OpenDriver: trying /usr/lib32/xorg/modules/dri/swrast_dri.
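
Note that "direct rendering: Yes" does not by itself mean hardware acceleration: llvmpipe also reports it. The renderer string is the more useful indicator. A small Python sketch of that check follows, assuming glxinfo's "OpenGL renderer string:" line format as quoted above; the marker list and function names are illustrative choices, not an official classification.

```python
# Substrings that usually indicate a Mesa software rasterizer.
# This list is an assumption for the sketch and is not exhaustive.
SOFTWARE_MARKERS = ("llvmpipe", "softpipe", "swrast", "software rasterizer")


def renderer_string(glxinfo_output):
    """Return the value of the 'OpenGL renderer string:' line, or None."""
    for line in glxinfo_output.splitlines():
        if line.startswith("OpenGL renderer string:"):
            return line.split(":", 1)[1].strip()
    return None


def is_software_renderer(glxinfo_output):
    """True if the reported renderer looks like a software rasterizer."""
    renderer = renderer_string(glxinfo_output)
    if renderer is None:
        return False
    lowered = renderer.lower()
    return any(marker in lowered for marker in SOFTWARE_MARKERS)
```

Users reporting the crash could paste their glxinfo output into such a check to tell quickly whether they are on the swrast/llvmpipe path too.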



Edit: But again: if BMX-NG is capable of presenting something on the screen ... I cannot imagine that it has to do with "software rendering" or "DRI2" or ....
Or does NG bind to something different than vanilla?

bye
Ron


dawlane(Posted 2016) [#8]
With any VM, make sure that the guest additions are installed. I should point out that sometimes they can be flaky with some distributions. There has been a time or two where I've installed the guest additions and it broke the OS that was being run as a VM.


Another possible reason is:
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.8, 128 bits)
That would be it then.

Or does NG bind to something different than vanilla?
I wouldn't say bind. From what I understand, bcc_ng works like it does with MonkeyX. It creates C/C++ source files that should be compiled and linked correctly, whereas bcc vanilla generates pure x86 assembly (with custom optimised code) that may not. It would come back to "There is always the possibility of subtle differences".


Derron(Posted 2016) [#9]
@ assembly
Ah, ok, so it does what gcc does "at the end". Meaning: while GCC seems to create something "runnable", what BCC (vanilla) creates might contain something "problematic".

I am no coding guru (nor expert nor ...) so I do not know whether one could see differences in the ".a/.i/.s/../whatever"-files BCC/BCC-NG-VIA-GCC create. If so, this might be a chance to fix something.

For now it seems the best bet is to just compile via NG :-)


@guest additions
Yes, for Arch/Manjaro they suggest to use the GA the repos offer, as they might be patched to work better than the original ones. I tried both variants without improvements.


Think this issue is an aspect at which NG should not try to behave similar to vanilla :-)


bye
Ron


Derron(Posted 2016) [#10]
Just got more feedback (translated from German):


When I run it on a machine (also Manjaro XFCE) with an NVidia card, it works.
Could it be the onboard Intel VGA card?


This then would mean: driver issue. I believe that the "VBox gpu driver" is similar to S3 or intel ones. Maybe the similarity is that high, that both (drivers) share a behaviour which does _not_ play nice with BlitzMax.


bye
Ron


Brucey(Posted 2016) [#11]
Maybe the similarity is that high, that both (drivers) share a behaviour which does _not_ play nice with BlitzMax

Yet it works with a BlitzMax NG-built binary?

Did you try a 32-bit NG binary?
(could be a 32-bit driver issue)


Derron(Posted 2016) [#12]
I tested it with x86 binaries built with NG.

Edit: 14:24
Tested x64 builds too ...working without trouble.

Bye
Ron


dawlane(Posted 2016) [#13]
GCC seems to create something "runnable"
GCC does exactly what bcc does, but better, as it outputs assembler in one of the two supported syntaxes, GNU (AT&T) or Intel, depending on the options passed. The problem is that FASM is part of the tool chain with bcc and is incompatible with the native assembler.

Simplified tool chain for NG
Compiler: BlitzMax_NG (bcc, with bmk controlling, outputs C/C++ files) -> gcc/g++ -> Assembler: as (aka gas, the GNU assembler) -> Linker: ld

so I do not know whether one could see differences in the ".a/.i/.s/../whatever"
.a These files contain archived object files (.o). The object files are the basic building blocks that just contain compiled machine code instructions. They also have the symbol tables for functions etc., ready for the linker to bind them into an executable or shared library.
.i These look like interface files for bcc to use. Think of these as the DEF files that you see with Microsoft's LIB format files.
.s These are the actual assembly source files and look to be in flat assembler format. Which would be bloody hard to debug with gdb as there is no decent debugger on Linux for it. Not sure if fasm2as.bmx in the src directory is for doing this and I don't recall seeing it being used anywhere in any of the BlitzMax sources? But then if it was it would be out of date.

Edit: If there are any differences it would be in the generated assembly files. But then - they would be totally different due to how the assembly was generated.

Did you try a 32-bit NG binary?
(could be a 32-bit driver issue)
I would have tested on both NG and vanilla on a pure 32bit distribution and a pure 64 bit distribution. If it works on a 32 bit distribution then I would know that there was a problem somewhere with the 32 bit libraries on a 64 bit distribution.

Edit: I believe that these were the packages that it used for archLinux. But they could be out of date.
lib32-libx11 lib32-libxxf86vm lib32-glu lib32-freetype2 lib32-libxft lib32-alsa-lib lib32-alsa-plugins lib32-libpulse lib32-openal openal


Derron(Posted 2016) [#14]
@ libs
Wouldn't it fail with a different message if one of the libs was missing?

On manjaro 15.12 32bit it fails too.
On manjaro 15.12 32bit it does not fail (tried digesteroids).


Bye
Ron


dawlane(Posted 2016) [#15]
Wouldn't it fail with a different message if one of the libs was missing?
You would get "cannot find libblah.so" messages if that was the case.
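
Missing libraries can also be checked before the program is even started, because ldd marks unresolved dependencies with "not found". A minimal Python sketch that scans such output is shown here; the function name is hypothetical, and it only assumes ldd's usual "name => target" line layout.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            # Lines without "=>" (e.g. the vDSO or the loader itself)
            # carry no resolution result, so skip them.
            continue
        name, target = line.split("=>", 1)
        if target.strip() == "not found":
            missing.append(name.strip())
    return missing
```

An empty result for the 32-bit binary would back up the point that missing libraries are not what causes the segfault.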

If I let it (Manjaro XFCE) run on a machine with NVidia card (gpu), it works.
Maybe it is related to the onboard intel gpu ?
Hardware information is always necessary when trying to nail down something like this.
lspci -vnn | grep VGA -A 20 would get you this information on the hardware.
Note: for more information you need to run the command as root/sudo/su and increase the number after -A.

Intel drivers have been known to cause numerous problems with software that requires acceleration, mostly down to the fact that the drivers tend to be crap.
There may be solutions, but they would require good knowledge of the hardware and drivers.


RustyKristi(Posted 2016) [#16]
The only configuration that works for me 100% right now is vanilla 1.5 linux and Ubuntu 14.04 LTS.

Can't do anything with 64bit version except brl.standardio stuff, always ends up in segfault.


Derron(Posted 2016) [#17]
You would get the cannot find libblah.so messages. If that was the case.


Which wasn't. So missing Libs seem not to be the problematic part.


@hardware
Still awaiting responses from those users. One seems to have tried it with an Intel one ...


@ 32 bit
My brain mixed something up when I wrote the post above regarding "fails on 32 bit manjaro" (shouldn't write while waiting in the car).

This happens when running digesteroids on 32bit manjaro (built with "vanilla BlitzMax 1.50")
Executing:digesteroids
pci id for fd 13: 80ee:beef, driver (null)
libGL error: core dri or dri2 extension not found
libGL error: failed to load driver: vboxvideo

Process complete

But it ran and displayed the game in fullscreen.


As you already predicted (rolling release), that binary failed on the 64bit 16.x release of Manjaro.


@assembler
But even then, I still do not completely get why "vanilla BCC" should fail on 64bit (with 32bit libs) and not on a 32bit system.
It translates "code" into "machine language". So if it fails there at specific spots, why does it fail only on a 64bit distro (and then, not on all distros).

Does it rely on some system functions to calculate addresses where it reads things from during build? I doubt that (as else, the 32bit-build of digesteroids would run on the 64bit one - but it only segfaults with the same reason as the build done on the 64bit OS itself).

So this is why I do not fully understand what kind of "error" BCC could make that it works on the one OS and not on the other.

If it wasn't a fault of BCC, it must be the libs... but why then do these libs work with BCC-NG?


Regarding "comparison": #2 contained a small example only requiring a small portion of all modules - dunno if that helps those people being able to understand what is written in the .bmx-files.


bye
Ron


Derron(Posted 2016) [#18]
Just additionally tried "Manjaro 16.x 32Bit" (the 32bit equivalent of the 64bit one where it fails).

Unzipped "BlitzMax150_linuxx86.tar.gz", ran MaxIDE, then compiled and ran "digesteroids" without problems.

So - when assuming "32bit and 64bit" have the same library-versions, there are no incompatibilities added to the libs during the last (stable) revisions (so when upgrading my mint, it does not break too ;-)).


bye
Ron