
A Modicum of Memory

Matt Mascarenhas
Until recently Cinera was a one-shot affair. A single run of the program would merely involve processing all the annotation files given to it, before exiting (typically in under a second). This speed-up alone is a pretty sweet step up from the existing annotation system for Handmade Hero which, on my machine and for reasons I'm really not sure I have the stomach to fathom, tended to take upward of a minute to do its stuff. In this state, however, Cinera would depend on some other service to monitor a repository or directory and trigger it into action when new annotations arrive for processing. Now, monitoring of Git{Hub,Lab,ea} repositories is still TODO, but as a step towards this I recently implemented monitoring of one local directory.

This was pretty straightforward to implement using the inotify API, and the new functionality slotted nicely into the existing code, which I had written with one eye on the program one day becoming an always-on thing. Now that we were always-on, though, I became more keenly aware of the program's memory consumption, and found to my horror (genuine!) that it could consume over 200MB of RAM. This really should not happen, especially as we only allocate one persistent chunk of memory (4MB) in which to do all our processing, and then allocate and free the odd chunk to temporarily contain the contents of files.
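For anyone unfamiliar with inotify, here is a minimal sketch of watching one local directory. It is not Cinera's code: WatchDirectory() and DrainEvents() are hypothetical names, and the choice of IN_CLOSE_WRITE and IN_MOVED_TO as the interesting events is an assumption about what "new annotations arriving" would look like.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

// Hypothetical helper: open a non-blocking inotify instance watching Path for
// files that have finished being written or have been moved in.
// Returns the inotify file descriptor, or -1 on failure.
static int WatchDirectory(const char *Path)
{
    int FD = inotify_init1(IN_NONBLOCK);
    if(FD == -1) { return -1; }
    if(inotify_add_watch(FD, Path, IN_CLOSE_WRITE | IN_MOVED_TO) == -1)
    {
        close(FD);
        return -1;
    }
    return FD;
}

// Hypothetical helper: read every pending event, print the affected filename
// and return the number of events seen. The buffer lives on the stack, so
// there is nothing to hand back on any return path.
static int DrainEvents(int FD)
{
    assert(FD >= 0);
    char Buffer[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    int Count = 0;
    ssize_t Length;
    // On a non-blocking descriptor read() returns -1 once the queue is empty
    while((Length = read(FD, Buffer, sizeof(Buffer))) > 0)
    {
        for(char *Ptr = Buffer; Ptr < Buffer + Length; )
        {
            struct inotify_event *Event = (struct inotify_event *)Ptr;
            if(Event->len > 0) { printf("changed: %s\n", Event->name); }
            ++Count;
            // Events are variable-length: a fixed header followed by the name
            Ptr += sizeof(struct inotify_event) + Event->len;
        }
    }
    return Count;
}
```

A long-running program would call something like DrainEvents() in a loop, sleeping or polling between calls, and hand each reported filename to whatever does the real processing.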

The >200MB situation was due to a little snafu in my new MonitorDirectory() function, whereby it would continuously claim a chunk of memory out of the 4MB memory arena to contain the inotify events, without declaiming that memory in all return cases. I still don't fully understand how it could keep claiming, or at least address into memory, supposedly past the end of the memory arena's allotted 4MB, so perhaps it'd be worthwhile trying to reproduce the behaviour and dive into exactly what was going on. The fix was just to call DeclaimBuffer() and return from the function after a new inotify event was processed, rather than continuing to iterate through the events.
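The shape of that bug can be sketched with a toy arena. Everything here is an assumption made for illustration: the arena and buffer structs, the signatures of ClaimBuffer() and DeclaimBuffer(), and the 4096-byte event buffer are all hypothetical, and unlike whatever was really going on, this ClaimBuffer() refuses to hand out memory past the end of the arena.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical arena: one persistent chunk, handed out stack-style
typedef struct
{
    char *Base;
    size_t Size;
    size_t Used;
} arena;

typedef struct
{
    char *Location;
    size_t Size;
} buffer;

// Claim Size bytes from the top of the arena. Returns 1 on success, 0 if the
// arena is too full (an assumed bounds check, not necessarily Cinera's).
static int ClaimBuffer(arena *Arena, buffer *Buffer, size_t Size)
{
    if(Size > Arena->Size - Arena->Used) { return 0; }
    Buffer->Location = Arena->Base + Arena->Used;
    Buffer->Size = Size;
    Arena->Used += Size;
    return 1;
}

// Hand back the most recently claimed buffer
static void DeclaimBuffer(arena *Arena, buffer *Buffer)
{
    assert(Buffer->Location + Buffer->Size == Arena->Base + Arena->Used);
    Arena->Used -= Buffer->Size;
    Buffer->Location = 0;
    Buffer->Size = 0;
}

// The buggy shape: one return path forgets to declaim
static int ProcessEventsLeaky(arena *Arena, int EventArrived)
{
    buffer Events;
    if(!ClaimBuffer(Arena, &Events, 4096)) { return -1; }
    if(EventArrived)
    {
        return 1; // BUG: Events is still claimed when we leave
    }
    DeclaimBuffer(Arena, &Events);
    return 0;
}

// The fixed shape: every return path declaims first
static int ProcessEventsFixed(arena *Arena, int EventArrived)
{
    buffer Events;
    if(!ClaimBuffer(Arena, &Events, 4096)) { return -1; }
    int Result = EventArrived ? 1 : 0;
    DeclaimBuffer(Arena, &Events);
    return Result;
}
```

Each call to ProcessEventsLeaky() with an event pending leaves Arena.Used 4096 bytes higher than before, whereas ProcessEventsFixed() always leaves it where it found it. With the bounds check in place the leaky version would eventually start returning -1 rather than growing without limit.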

Would that this were the end of the story! Further testing revealed that we still used way more than the expected 4MB, allowing for whatever memory may not be under our control. (Only today I discovered that a blank program consumes a significant non-zero amount of memory, and will be keen to delve into this stuff at some point.) In one test, a fresh run of the program, with an empty project directory and no index file, consumed 12124KB, increasing to 12744KB, at which point I began adding to the project directory the annotation files for RISCY BUSINESS Days 1 to 10, one file at a time. After each file, memory consumption increased (never decreased or levelled), and at the end of the process 25760KB had been consumed. I then ran the test again, this time moving the ten annotation files into the project directory in one go, after which process only 17536KB had been consumed - ~8MB lower memory consumption after producing, to all intents and purposes, the same output.

Where was all this memory going? Logging calls to ClaimBuffer() and DeclaimBuffer(), calloc(), malloc() and free(), and various other "big" functions just to get an idea of who was calling what, yielded no insights at all. All claimed buffers were accompanied by a declaim, likewise all allocations by a free. Perhaps, then, passing structs around by value, rather than address, was the culprit. But no, passing everyone by address actually seemed to increase the calls to mmap() as reported by strace. This is a test I would like to perform again, to see if there is actually a correlation between the passing of structs by value vs address, and the number of calls to mmap(), or whether something else caused the increased mmap() calls.
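For reference, the kind of allocation logging described above might look something like this. It is only a sketch: LoggedAlloc(), LoggedFree(), the two counters and the LOGGED_* macros are hypothetical names, not anything from Cinera.

```c
#include <stdio.h>
#include <stdlib.h>

// Running totals, so a mismatch shows up without reading the whole log
static long AllocCount = 0;
static long FreeCount = 0;

// Allocate via malloc(), logging the call site, size and resulting address
static void *LoggedAlloc(size_t Size, const char *File, int Line)
{
    void *Result = malloc(Size);
    if(Result) { ++AllocCount; }
    fprintf(stderr, "%s:%d: malloc(%zu) = %p\n", File, Line, Size, Result);
    return Result;
}

// Log the call site and address, then free the pointer
static void LoggedFree(void *Ptr, const char *File, int Line)
{
    if(Ptr) { ++FreeCount; }
    fprintf(stderr, "%s:%d: free(%p)\n", File, Line, Ptr);
    free(Ptr);
}

// Call-site macros that fill in the file and line automatically
#define LOGGED_MALLOC(Size) LoggedAlloc((Size), __FILE__, __LINE__)
#define LOGGED_FREE(Ptr) LoggedFree((Ptr), __FILE__, __LINE__)
```

If AllocCount and FreeCount agree once the work is done, as they did here, then every successful allocation made through the wrappers was matched by a free, and the extra memory must be coming from somewhere else.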

And then, for some serendipitous reason, I decided to try compiling the program without producing debug symbols or using the address sanitizer. Running the same test as before - empty project directory and no index file - consumed 4200KB. Roughly eight megabytes less than previously, and much closer to our expected 4MB. I again proceeded to add the annotations for RISCY BUSINESS Days 1 to 10, one file at a time. After riscy001.hmml, 6240KB had been consumed. After riscy002.hmml, the same amount. All the way up to riscy010.hmml. This seemed promising. I then reset and performed the test again, this time adding Days 1 to 10 all at once. The initial memory consumption, with an empty project directory and no index file, was 4196KB - 4KB less than in the previous test, and close enough to sustain my confidence in the program. After adding the ten files at once, the memory consumption had risen to 6236KB - yes, 4KB less than before. And Cinera has remained running while I've written this, reliably consuming 6236KB, and in a state where I feel ready to move on to the next thing.
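This post doesn't go into how the figures were measured, but on Linux a process can read a comparable number for itself. The following is a sketch under that assumption: ResidentKB() is a hypothetical helper that pulls the VmRSS line (resident set size, in kB) out of /proc/self/status, which may or may not be the same metric as the numbers quoted above.

```c
#include <stdio.h>

// Hypothetical helper: return this process's resident set size in kB as
// reported by the Linux /proc filesystem, or -1 if it cannot be read.
static long ResidentKB(void)
{
    FILE *Status = fopen("/proc/self/status", "r");
    if(!Status) { return -1; }
    char Line[256];
    long KB = -1;
    while(fgets(Line, sizeof(Line), Status))
    {
        // The line of interest looks like "VmRSS:     4200 kB"
        if(sscanf(Line, "VmRSS: %ld kB", &KB) == 1) { break; }
    }
    fclose(Status);
    return KB;
}
```

Logging ResidentKB() after each file is processed, once in a build with the address sanitizer and once without, would give a direct side-by-side comparison of the two builds.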

So the moral of this story is: if in doubt, remember to try turning off tools that are your life raft in some situations, because in others they may unexpectedly prove to be your burden.
AH! ASAN strikes again! Nice find Miblo :)