Edited: Compiler needs GC mode 2 :)
| ||
UPDATE TO THE UPDATE: Seems it was a memory problem after all, as changing the GC mode has fixed it. Have to say it did a REALLY convincing job of looking like 'number of imports' was the problem, as the same amount of code compiled if arranged in fewer files.

UPDATE: After much faffing and fiddling I discovered the crash is purely down to how many files are being imported during compilation. See the bottom of this post for the history of the problem. To cut a long story short, I tried commenting out half of spine.monkey (which is just a big list of imports) to try and narrow down the problem. I ended up with a project that would compile with just the last five files commented out. I could swap any of those files with any of the others and it would still compile, but if I imported just one extra file (even if it had no code in it) the compiler would die. I then copied the contents of the last five files into one of the other spine files, and it compiles with no problems. The spectacular irony here is that a few weeks ago, at the urging of another coder on the team, I went through our UI library and split everything into nice, readable, individual files.

=========================================================

I've been trying without success to integrate the Spine runtime module with an existing project. The Spine example project compiles OK, but if I so much as "import spine" or "import spine.spinemojo" in MY project, I get this:

Semanting...
Translating...
Building...
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Abnormal program termination.
Exit code: 3

I tried running transcc through the VS debugger, but that was no help. For one thing, even if I recompile transcc in debug, there's no debug information for VS to latch on to. For another, if I run transcc through the debugger (no modifications; just the executable I've been using all year to compile the project) it crashes with an illegal memory access during "Semanting..." regardless of whether I'm importing spine or not.

It's a frustrating situation because Spine is an ideal solution to our project's needs, but I only have a very small window of time to get it integrated. Does anyone have any idea what could be causing the compiler to SIAD, or failing that, how to get a working, debuggable version of transcc I can pick through myself?

Thanks.

Andy |
| ||
NB: using Monkey 80F and latest Jungle. |
| ||
Does it happen with Ted too? |
| ||
Did you look to see how much memory transcc is using? Mark mentioned in the past that there is no gc running because the runtime is so small. |
| ||
@MikeHart: It's not a Jungle IDE thing; if the compiler crashes, it's a compiler thing. I suspect it's caused by Trans not using any GC at all. Try recompiling trans with the non-mojo-based GC. |
| ||
To elaborate on what Ziggy and GW_ said, use garbage collection mode 2 when rebuilding transcc. This can be done with the 'CPP_GC_MODE' preprocessor variable on targets using the C++ based garbage collector. For more information, click here. In addition, you could look into 'CPP_GC_TRIGGER' and 'CPP_GC_MAX_LOCALS' (as seen on this page). Assuming this fixes the problem, Mark should probably enable the garbage collector for transcc.

EDIT: By the way, have you tried debugging 'transcc' itself? If you need to do it externally (via Visual Studio), you'll need to make a debug build of transcc with the GLFW target (disable 'GLFW_USE_MINGW'). The C++ Tool (STDCPP) target does not support MSVC directly. At least from what it's saying in your output, I think it might be an issue with the size of the post-translation output (or perhaps transcc is using too much memory by the time the native compiler runs). |
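As a rough illustration (the file name, placement and values here are assumptions; check your Monkey version's docs for the exact settings), the idea is to set the preprocessor variables near the top of transcc's main source file before rebuilding it:

    ' transcc.monkey - hypothetical sketch of the GC settings described above.
    ' 2 = the 'on the fly' C++ garbage collector (1 relies on mojo's update
    ' loop; 0 disables collection entirely, which is what bites a long-running
    ' command-line tool like transcc).
    #CPP_GC_MODE=2
    ' Optional tuning: roughly how many bytes get allocated before a sweep runs.
    '#CPP_GC_TRIGGER=8*1024*1024

Then rebuild transcc for your desktop target as usual and swap the new executable in.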
| ||
It's using a lot of memory, but nowhere near the 32-bit limit. I've been caught out by the GC before with a preprocessing tool, and that only fell over when it was up around 4GB. The crucial thing is that it's purely the number of imports, not the total amount of code (input or generated). The compiler crashed even if I removed all code from several files. Conversely, I amalgamated the code from a bunch of Spine files and it's now compiling perfectly happily. Thanks for the tips re: the transcc GC and debugging. It's not something I can squeeze in right now, but I'll look into it when I get the chance. |
| ||
Interesting! Any idea on the number of Imports it took to cause the crash? |
| ||
I'll figure that out when I get the chance to dig deeper. |
| ||
Update: Seems like it was a memory problem after all. I don't know why moving code around and reducing the number of imports made a difference, but it did. I've recompiled transcc with the on-the-fly GC and it works fine now. As an aside, is there any scope for speeding up the "Semanting..." phase? |
| ||
Speeding it up? I mean, threads are an option, but that's Monkey 2 territory. You could try upping the priority of the process, but that's about it. What you're talking about is the phase responsible for evaluating your source code. The most I can recommend is looking at what you're reflecting. There are also generics, which slow this down, but even with CRT patterns this isn't a major time-hit. Honestly, transcc is pretty fast as it is, even if it could be faster. The real bottleneck tends to be the native compiler. I mean, have you tried the HTML5 target? That builds really quickly, even for large projects. |
| ||
AFAIK the HTML5 target handles tinting of images poorly (or it did last time I checked), so it's impractical to run the project that way. The 'Semanting' phase is currently taking substantially longer than the native build phase. I haven't touched reflection, so I don't know about that. We do use a lot of generics. |
| ||
Actual figures for the above: Semanting takes 49 seconds; translating through to the game window appearing on screen takes 28 seconds. |
| ||
As a test I manually combined 84 of the source files from our custom UI module into just two. That brought the 'Semanting' time of the project down from 49 seconds to 37 (a saving of roughly a quarter!). The project itself has around 130 source files. Our UI module has 187, Diddy has 33, Mojo about 20, brl 30, and there are a few others, so around 400 all told. 84 is around a fifth of that total, so there's a reasonable correlation between the number of files being imported and the Semanting time, given the same total lines of code. I'm compiling on a solid-state drive, and manipulating that number of files is next to instantaneous, so there's definitely some significant compiler overhead in the importing process, over and above the time it takes to read the code once it's loaded. |
| ||
What platform? According to Microsoft, the C run-time libraries have a limit of 512 files that can be open at any one time. I would add a handle count column to your process monitor and check that trans is closing its files correctly. |
| ||
I don't think it's a hard limit thing. I amalgamated 62 files into 1 and it saved 8 seconds, and then amalgamated another 22 into 1 and saved another 4 seconds. |
| ||
Further investigation shows that essentially all the time is being spent in app.semant. Disabling reflection (to be certain, I rebuilt the compiler with the reflection check commented out) makes no difference at all, and there is no disc activity (presumably that all happened during parsing).

I added some crude stat tracking to the compiler and found what I think must be the smoking gun. With the project as-is, I get these stats from ScopeDecl:

GetDecl_Success: 169091
GetDecl_Fail: 12724495
FindDecl_Count: 112098
FindModuleDecl_Count: 1801

With 62 files combined into one, I get these stats:

GetDecl_Success: 169091
GetDecl_Fail: 10818843
FindDecl_Count: 112098
FindModuleDecl_Count: 1801

With fewer files, the number of failed GetDecl attempts drops by about 15%. The "Semanting..." time (now on my slightly beefier personal laptop) drops from 45 seconds to 38 - almost exactly 15% quicker. That's pretty interesting, as it suggests that the speed of the semanting process is governed almost entirely by how many files (and hence scopes) a project is broken down into. I'm going to look into ways of short-cutting the search process. I'll keep you posted. |
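For anyone curious, the instrumentation needed is tiny. A minimal sketch follows (the counter names mirror the figures above, but the helper and where it gets called inside ScopeDecl.GetDecl are my own assumptions about trans's internals):

    ' Illustrative only - in practice these would sit in trans/decl.monkey.
    Global GetDecl_Success:Int = 0
    Global GetDecl_Fail:Int = 0
    Global FindDecl_Count:Int = 0
    Global FindModuleDecl_Count:Int = 0

    ' Bump a counter just before ScopeDecl.GetDecl returns.
    Function CountGetDecl:Void( found:Bool )
        If found
            GetDecl_Success += 1
        Else
            GetDecl_Fail += 1
        Endif
    End

Dumping the totals with Print once semanting finishes is enough to get figures like those quoted above.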
| ||
The only things I can think of are Mark's heavy usage of dynamic casts, and poor paging/superfetch performance. Considering you're running an SSD, I doubt it's much of a disk bottleneck. And though dynamic casts are bad on many levels, a -O3 release build with GCC/MinGW, coupled with modern hardware, shouldn't perform terribly.

Regardless of the bottleneck, the fact is, Monkey's build system is rather problematic. It generates a single native source file, and since compilers like g++ can only skip work on a per-file basis, you have to recompile everything but the native frameworks every time. Because of this, and a number of other problems, Mark's making Monkey 2. Following the logic that Mark wants to build a better compiler, I can only assume this includes the semantic phase.

As far as the compiler's design goes, it's a mess. The fact that dynamic casts are used anywhere other than corner cases is absurd; it's just not good practice at all. Other than being a specialization nightmare, they're similar to virtual calls, where the big performance hits are caused by cache misses. But unlike virtual calls, modern CPUs aren't anywhere near as good at handling dynamic casts as they are at branch prediction. x64 CPUs today are good enough at dealing with this kind of thing, but it wouldn't surprise me if the abundant dynamic casts had a part in the performance problems, especially when it's scaled up to something like this.

What I think needs to be understood is that Monkey's compiler is fast, but it's not exactly efficient, and it has some big design problems. Because of this, it doesn't scale as well as it should. Unfortunately, to speed up the problems I mentioned, you'd have to overhaul a lot of it, including the other passes. The only other thing I can say is to try configuring the garbage collector further (max locals, etc.). Or, you know, wait for Monkey 2, but that's a while out.

I guess theoretically an easier way to crunch out that last bit of performance would be to make the translator output separate files, then rely on g++ only recompiling what changed. But that's just for the sake of bringing the time down. |
| ||
It's definitely not a disk bottleneck, because there's no disc access during the semanting phase. It actually runs quicker on my personal laptop (with a slightly faster CPU) building the project from my secondary non-SSD. Based on my tests, I think the most severe bottleneck is far more straightforward than the issues you raise: the time spent compiling a given quantity of code scales almost linearly with the number of files that code is spread over. Looking at the declaration search process, it's pretty clear why this is the case, too. I'm working on an optimisation now. |
| ||
OK, I've implemented the following optimisation to ModuleDecl.GetDecl.

If the ModuleDecl is dirty (it's new, or has had decls added since it was last queried - thanks, 'reflection', for that particular head-scratcher), it proceeds as follows. First, it creates a Public Access List of modules by walking the public import lists in the manner of the original GetDecl method, then creates a Private Access List by cloning the above and walking any other modules privately imported by this ModuleDecl. It then scrapes the declmap of each module in these lists to create public and private Indirect Ident Maps. An Indirect Ident Map (IIM) is a stringmap of idents to SynonymLists, and a SynonymList is a list of decl/moduledecl pairs that all match the same ident. The end products are two maps of every synonym of every decl reachable from, or via an import of, this module, plus the aforementioned accessibility lists, which will also come in handy.

Once this is done, or if the ModuleDecl is 'clean', it proceeds as follows:

- If the _env.ModuleScope MATCHES this ModuleScope, it grabs the SynonymList from the private Indirect Ident Map and picks the first decl that satisfies the prevailing accessibility criteria (copied from the original GetDecl method). You still get an error message if more than one match is found, just like before.
- If the _env.ModuleScope does NOT MATCH this ModuleScope, but the _env.ModuleScope IS in this module's public access list, it checks the _env.ModuleScope's private IIM and this module's public IIM, which as far as I can work out duplicates the behaviour of the original code.
- If the _env.ModuleScope does NOT MATCH this ModuleScope, and ISN'T in this module's public access list either, it just checks this module's public IIM.

I did some test runs on our project in which both the old and new GetDecl were called and the results compared, and all 169,000 calls match. Performance-wise, it now semants our project in under 4 seconds, down from 49 :) There's probably a bunch more fat to be cut (given the huge overlap in the accessibility lists of different modules), but 4 seconds is not enough for me to spend any more time on. If anyone's interested in the changes (purely to decl.monkey), let me know.

EDIT: My implementation requires a 'clone' method to be added to Map, so there's that. |
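To make the shape of the cache concrete, here's a rough sketch of the structures described above (class and field names are my own illustration, not the actual patch; the real change lives in decl.monkey and relies on the Map clone method mentioned in the EDIT, and Decl/ModuleDecl are trans's existing classes):

    ' Illustrative sketch only.
    Class SynonymEntry
        Field decl:Decl            ' a declaration matching the ident
        Field owner:ModuleDecl     ' the module it was found in
    End

    ' All reachable declarations that share one ident.
    Class SynonymList Extends List<SynonymEntry>
    End

    ' Per-module cache, rebuilt whenever the module is 'dirty':
    '   publicIIM  - idents reachable via public imports only
    '   privateIIM - the above plus idents reachable via private imports
    Class ModuleIdentCache
        Field publicIIM:StringMap<SynonymList> = New StringMap<SynonymList>
        Field privateIIM:StringMap<SynonymList> = New StringMap<SynonymList>
    End

A GetDecl lookup then becomes a single map fetch plus a short scan of the SynonymList for an accessible match, instead of re-walking the import graph on every call.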
| ||
I'm actually really interested in your implementation, and if it's stable, you should make a pull request on GitHub, or at least fork Monkey publicly from there. If you're getting this big a performance boost, then it's definitely worth adopting. |
| ||
EDIT: Don't use this yet! I was mulling it over on the way home and realised I'd cocked something up: inserting new decls into a module has to invalidate all cached data, not just the cache of the affected module. Kind of surprised that didn't show up in the soak test, actually. Anyway, I'll fix it tonight at some point.

====================================================

Would you believe I've never had to use GitHub before? Think I got my head around it eventually: https://github.com/JocelynSachs/monkey

That's got the Map change and the Decl change. Unfortunately I wasn't able to test them in that version, as I'm on the clock and we aren't using that version of the modules for our project. I can only apologise if you find they don't compile :) |
| ||
Ok, brain fart fixed. Still haven't had a chance to compile those changes against the latest modules; sorry. Feel free to try it out yourself, or else I'll try to find the time soon. |
| ||
Compiler errors now fixed courtesy of Anthony D. Much obliged :) |
| ||
@Peeling: I've been using this for a while without any problems; any thoughts on making a pull request? |
| ||
I submitted a pull request yesterday. |