Image credit: analognowhere.com

Tarme: A technical deep dive

This isn’t the first time I’ve talked about Tarme. In fact, the very first post on this blog was about Tarme. However, I’ll be the first to say that post was, to put it nicely, pretty crap.

So, just like anyone who is definitely not struggling with ideas, it’s time for a redux!

Why Tarme?

The reason for making Tarme was quite simple: because I wanted to :3

Okay, it was a bit more than that. Before starting work on Tarme, I had already tried some existing status generators, namely i3blocks and bumblebee-status. And y'know what? I liked neither. i3blocks mostly did the job, and the fact that blocks could be written in any language meant I could quite easily create my own blocks in whatever languages I felt like using, so I did just that.

But as I wrote my own blocks, i3blocks' limitations became rather obvious, and, while none of them necessarily made what I wanted to do impossible, I definitely wanted that extra bit of space to stretch my legs.

Technical choices

I decided to write Tarme in Chicken Scheme pretty much for a single reason: I was messing around with it at the time and wanted a bigger project to really see what it was like to work with.

Next, the architecture! I started by reading the i3bar protocol. The protocol uses JSON over stdio for IPC between the status generator and the panel: each update is a JSON array of objects, where each object represents one block in the panel.
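
For a sense of what that looks like on the wire, here’s a trimmed exchange (block contents made up by me): the generator prints a header object, opens a never-ending JSON array, then emits one array of blocks per update.

{"version": 1, "click_events": true}
[
[{"full_text": "Mem: 3.1G"}, {"full_text": "12:34"}],
[{"full_text": "Mem: 3.2G"}, {"full_text": "12:35"}],
...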

With that in mind, Tarme would work in much the same way: the blocks would be stored in a list, where each block was a lambda that returned a Scheme association list (a decently close analog to JSON objects). From there, on each update I’d call all the lambdas and assemble the final list to send to the panel. It was a simple architecture, but a bit too simple. What if I wanted some block properties to be constant? And what if I wanted some blocks to have local data? While there are ways to fix both of those issues using only native Scheme constructs, at the time I opted for turning blocks into SRFI-99 records with 4 fields (sketched after the list):

  • static - this one would hold a set of static default values for the block which would be prepended to whatever the block’s function returns
  • data - a field that Tarme doesn’t use but that the block may use however it sees fit; more on this later
  • update - a lambda that would be called by Tarme each time the block status is updated
  • click-events - meant to hold click events for each block; more on this later as well
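
In Scheme, that record might look something like the following. This is a sketch, not Tarme’s actual definition; in particular, the keyword-argument Block constructor used in the examples below is my guess at a thin wrapper, written with Chicken’s #!key extended lambda lists.

;; SRFI-9-style definition, which SRFI-99 also accepts;
;; field names are the ones listed above.
(define-record-type block
  (make-block static data update click-events)
  block?
  (static block-static)
  (data block-data block-data-set!)
  (update block-update)
  (click-events block-click-events))

;; Hypothetical keyword wrapper matching the `Block` calls below.
(define (Block #!key (static '()) (data '()) update (click-events '()))
  (make-block static data update click-events))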

My first regret: private block data

While most of this architecture is fine, there is one thing that I’ll say right now was not the best choice, and that’s the data field. Really, this choice gives away my inexperience with Scheme and lexical scoping at the time. As anyone used to Scheme and other lisps will know, there’s a far better way to have private data that is only accessible from within a single lambda and is preserved between calls to said lambda, but I’m getting ahead of myself.

But first, why is this so bad? In any other language, this would’ve been fine; in fact, it was largely meant to emulate the way Python object fields work. In practice, however, it ends up being rather clunky in Scheme if you need to store more than a single value in the data field.

For a concrete example: say you have a block that counts how many times it has been called, then returns that number as the text to be displayed by the panel. Such a block would look as follows:

 
(Block data: '((calls . 0))
       update: (lambda (self)
                  ;; bump the counter stored in the block's data alist
                  (alist-update! 'calls
                                 (add1 (alist-ref 'calls (block-data self)))
                                 (block-data self))
                  ;; hand the panel the new count as the block's text
                  `((full_text . ,(alist-ref 'calls (block-data self))))))
 

That’s quite a lot of code just to increment a single value. Now, in this specific case you could do away with the alist and store the value directly under the data field of the block record, but the moment you need more than one value stored, you’re right back to this madness.
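
For completeness, that single-value shortcut would look something like this (using the hypothetical block-data-set! mutator from the record sketch above):

(Block data: 0
       update: (lambda (self)
                  ;; one value only: store the counter straight in data
                  (block-data-set! self (add1 (block-data self)))
                  `((full_text . ,(block-data self)))))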

So, what’s this magical better way to do it that I alluded to earlier?

Closures.

Explaining closures is a bit outside the scope of this post, but what you need to know is that they (ab)use lexical scoping to create the same effect as static variables in other languages.

So what does that same block from above look like when using closures for the static data?

 
(Block update: (let ((calls 0))
                 ;; calls is captured by the lambda and survives between updates
                 (lambda (self)
                   (set! calls (add1 calls))
                   `((full_text . ,calls)))))
 

Quite a bit smaller and cleaner, isn’t it? In fact, it ends up being more flexible as well, since closures also allow any given block to load external code more cleanly, without affecting other blocks or Tarme itself, as it’s all contained within the local scope. In other words…

I see this as an absolute win!

Perpetually Coming Soon™

If you’ve kept up with Tarme since I first made its development public on my fediverse account, you’ll know there’s one feature that I’ve been postponing over and over, and that’s click events. I… have no excuse for this. There is an implementation currently in the code, and it has been there for months now, but I could never be bothered to test and fix it, instead marking it as “not done” in the README.

Anyway, moving on!

Some SNAFUs along the way

Like any sufficiently complex piece of software, Tarme has had quite a few weird bugs that I’ve caused and fixed, so here are a few of the notable ones!

Weird threading and blocking I/O

The i3bar protocol that Tarme is built against has data flowing both ways, so I thought it would be a great idea to have two threads. The main thread would wait for input from the panel, do its thing with said input, then wait for the next input. Pushing updates to the panel would then be done using signal handlers: the update function was assigned to handle the POSIX alarm signal, and I’d just make Tarme send itself the alarm signal every second. There were a few issues with this whole architecture, the first one being that input would never reach the main thread, no matter how many events the panel sent.

The fix was quite simple: instead of trying to read until there’s actually something to read, I just check whether there’s anything to be read from (current-input-port), read it if there is, or skip if there isn’t.
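
A minimal sketch of that check, assuming char-ready? is the polling mechanism (handle-panel-event is a made-up stand-in for whatever actually processes the event):

(import (chicken io))  ; read-line

(define (maybe-handle-input)
  ;; only read when the panel has actually sent something
  (when (char-ready? (current-input-port))
    (handle-panel-event (read-line))))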

Update clock drift

Also related to the above architecture: there was no way to set the alarm signal to be sent every 1 second exactly; instead, what I did was schedule the signal to be sent 1 second after an update was done. Needless to say, this caused the update cycle to drift like it was driving in Initial D, so a different approach was needed.

The solution ended up being a complete rearchitecting of the threading. Instead of having two threads, everything would happen in a single thread. As stated above, in the first half of the update loop, it checks whether there’s any input in (current-input-port) and, if there is, reads it and does what’s appropriate with it; otherwise, it just skips to the next part: updating the blocks and sending them to the panel. Timing was also replaced by SRFI-18’s thread-sleep!, which can put the thread to sleep until a certain time instead of for a certain time. This makes consistent timing easy: just put the thread to sleep until exactly 1 second after the last time it woke up, regardless of when the thread actually goes to sleep.
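
A sketch of that loop, assuming SRFI-18 time objects (update-blocks! is hypothetical): the scheduled wake-up advances by exactly one second per cycle, so a late wake-up doesn’t push every later one back.

(import srfi-18)

(define (main-loop wake)
  (maybe-handle-input)   ; poll the panel, as sketched earlier
  (update-blocks!)       ; hypothetical: run every block's update lambda
  (let ((next (+ wake 1)))
    ;; sleep until an absolute point in time, not for a duration
    (thread-sleep! (seconds->time next))
    (main-loop next)))

(main-loop (time->seconds (current-time)))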

Inconsistent update timing

It wasn’t actually related to timing.

If you know anything about Scheme, you know it is a garbage-collected language. Normally, this isn’t an issue, and it most definitely shouldn’t be one here, since Tarme has plenty of time to spare for garbage collection.

…right?

As it turns out, when the thread is put to sleep, the garbage collector is also put to sleep. Any collection that needed to happen could therefore only happen during the few milliseconds when Tarme was actually running, causing updates to consistently arrive late.

The fix feels a little crude but has worked surprisingly well: manually trigger a major garbage collection at the end of each update, before putting the thread to sleep. This ensures the update work gets done on time and there’s never enough garbage piling up to trigger a collection mid-update.
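
In Chicken, that’s a one-liner via the gc procedure; a sketch of where it would sit in the loop:

(import (chicken gc))

;; at the end of an update, before thread-sleep!:
(gc #t)   ; force a major collection while we have time to spare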

The update rate going completely insane

One day, I just took my laptop with me, used it a bunch, then put it to sleep. When I got back home a few hours later, I took it out of the bag, turned it on, and saw Tarme updating way too damn quickly. Took a bit of testing to see what the issue really was, but eventually I figured it out.

As stated earlier, Tarme goes to sleep until exactly 1 second after the last time it woke up. But what happens if that last wake-up was several hours ago because the whole system went to sleep in the meantime? In that case, Tarme tries to go to sleep, sees that the time it set to wake up is in the past, and just keeps going. This is fine if it’s just a few missed updates, but when it’s several hours’ worth, Tarme ends up updating as fast as it can trying to catch up to the clock, using up a lot of CPU time in the process.

Much like earlier, the solution was quite simple: before putting the thread to sleep, check whether the scheduled wake-up time has fallen severely behind and, if it has, sync it back to the current system clock.
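
Bolted onto the earlier loop sketch, the guard might look like this (the 5-second threshold is an arbitrary choice of mine, not Tarme’s):

(define (next-wake wake)
  (let ((now (time->seconds (current-time)))
        (next (+ wake 1)))
    (if (< next (- now 5))
        now      ; way behind (e.g. after a suspend): resync to the clock
        next)))  ; otherwise keep the drift-free schedule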

What have we learned?

Personally, I’ve learned a lot about Chicken Scheme, POSIX IPC, and scripting command-line tools to fetch information or run some action. And of course, I also learned that technical debt is a very real thing, even in very small programs. As small as Tarme is (343 lines of code according to wc -l, not counting any blocks or configuration), it has already undergone a few refactors, and there are still changes I wish to make to lift limitations and clean things up.

As for what you’ve learned from this, Idk, I seem to have an audience that’s either smarter than me about programming, or that has absolutely zero cares to give about programming. :P