
Ninja causing OOM!

Ninja has been around for a while, since 2012 or so.


Ninja is a small build system with a focus on speed. https://ninja-build.org/


I had heard of it, and I think an engineer I worked with at Cybex was using it as his build system. Ninja's goal is to be faster than GNU make, the de facto standard make utility.

David Rothlis benchmarked ninja and found that the no-op build (a rebuild with few or no changes, as opposed to a clean build) is significantly faster with ninja.

I just started hacking on KiCad for fun. KiCad is an excellent open source electronics design tool that I've used to design and build several PCBs.

Image of iCEBreaker Bitsy in the KiCad PCB Editor


The KiCad devs recommended using ninja for builds as it's faster. Why not? Ninja has been around for a long time, so it should be pretty well developed, and it's a good opportunity to see how it works. If it works well, it could be a good thing to adopt across a number of other projects that presently use GNU make.


So I'm off building KiCad, the MacBook fans are cranking, CPU load is at 1000% (which means several processors are working in parallel), and the code is building along rapidly. This is a good sign! GNU make defaults to a single job (you can say make -jX to use X jobs instead, but you have to remember to do this). With a single job you end up with a single instance of the compiler building a single file at a time.

Modern computers have several processor cores. Unlike GNU make, ninja makes use of those cores by default; no command line arguments are necessary to get this parallel build behavior.

A few minutes later I noticed the terminal where I initiated the 'ninja' build had disappeared. That's odd... I immediately suspected the OOM killer (out-of-memory killer) but who knows...

I opened a new terminal and kicked off another ninja build, and it completed.

When I went to rebuild, I watched more carefully. Sure enough, the same thing occurred. Now I really suspected the OOM killer. The OOM killer steps in when the system is running so low on memory that it risks slowing to the point of being unusable, and it kills the processes that are using too much memory.

Sure enough, the logs from 'journalctl -r' showed that it was an OOM situation.


Why is ninja causing an OOM condition? Is it something configured or used incorrectly?

In my case I'm using qemu to run Ubuntu 22.10. The host is a 2019 MacBook Pro with an 8-core i9 and 32GB of RAM. The VM is given access to all 16 logical processors and half of the RAM, 16GB.

Ninja is spawning jobs based on the 16 processors it sees as available. Ninja doesn't presently check how much memory is available on the system, and KiCad is complex enough that each instance of gcc is likely taking a few hundred megabytes to compile each file. At some point you end up with a handful of gcc instances compiling particularly large files at the same time, and the OOM killer sees that the gnome terminal, of which the ninja instance is a child, is taking up almost all of the system memory.
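To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 16 jobs and 16GB come from my setup above; the per-job memory figures are hypothetical guesses rather than measurements of gcc building KiCad, and the worst case assumes every job peaks at the same moment.

```python
def peak_memory_gib(jobs: int, per_job_mib: float) -> float:
    """Worst-case compiler memory, in GiB, if every job peaks at once."""
    return jobs * per_job_mib / 1024


VM_RAM_GIB = 16  # RAM given to the VM in this post
JOBS = 16        # one job per processor the VM sees (simplification)

# Hypothetical per-job peaks in MiB; real numbers depend on the file being compiled.
for per_job_mib in (300, 600, 1200):
    peak = peak_memory_gib(JOBS, per_job_mib)
    verdict = "fits" if peak < VM_RAM_GIB else "exceeds RAM"
    print(f"{per_job_mib} MiB/job x {JOBS} jobs = {peak:.1f} GiB ({verdict})")
```

The point is only that the total scales with the job count: a few hundred megabytes per job is comfortable, but once several jobs land on large files at once the sum can pass what the VM has, before even counting the desktop and everything else running in it.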

Others have reported hitting the same issue in GitHub issues, and there is this open issue, Ninja and RAM/Memory usage #2187.

Ninja is SO good at using all available compute resources that it ends up pushing ram usage far more than GNU make does. As I mentioned in my comment on issue #2187, I worry that far more people could be hitting this issue and switching away from ninja without reporting or debugging it.

To be able to kick off a build and come back to find it complete rather than killed, I'll have to figure out how to keep ninja from triggering the OOM condition. I think 'ninja -j10' or something similar could do it. We'll see!
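Rather than guessing at the number, one option is to derive it from how much memory is free. This is a minimal sketch, not something ninja does itself: the helper names are mine, the 1024 MiB per-job budget is an assumed figure that would need tuning for a given project, and reading MemAvailable from /proc/meminfo only works on Linux.

```python
def read_mem_available_kib(path: str = "/proc/meminfo") -> int:
    """Return the MemAvailable value from /proc/meminfo, in KiB (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1])
    raise RuntimeError("MemAvailable not found in " + path)


def pick_jobs(mem_available_kib: int, cpu_count: int, per_job_mib: int = 1024) -> int:
    """Cap the job count by both CPU count and an assumed per-job memory budget."""
    by_memory = mem_available_kib // (per_job_mib * 1024)
    # Always allow at least one job so the build can make progress.
    return max(1, min(cpu_count, by_memory))


# Example with fixed numbers: 10 GiB available, 16 CPUs, 1 GiB assumed per job.
print(pick_jobs(10 * 1024 * 1024, 16))  # prints 10
```

On the VM itself this would be called as pick_jobs(read_mem_available_kib(), os.cpu_count() or 1) after an import os, and the result passed to ninja as -jN. It is a static guess made before the build starts, so a few unusually large files could still push memory past the budget.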
