LocalPaper

Illustration

Ever since I got my first Be Book reader I was a fan of epaper displays. So, when I saw a decent-looking Trmnl dashboard, I was immediately drawn to it. It’s a nice-looking device with actually great dashboard software. It’s not perfect (e.g. darn display updates take ages), but there is nothing major I mind. It even has an option to host your own server and half-decent instructions on how to get it running.

However, even with all this goodness, I decided to roll my own software regardless. Reason? Well, my main beef was that I just wanted a simple list of events for today and tomorrow. On the left I would have general family events, while on the right each of my kids would get their own column. I did try to use the calendar for this, but the main issue was that it was set up as a calendar and not really as the event list I wanted. Also, it was not really possible to filter just based on date and not on time. For example, I didn’t want past events to disappear until the day is over. And I really wanted a separate box for tomorrow. Essentially, I wanted exactly what is depicted above.

To make things more complicated, I also didn’t want to use calendars for this. The main reason was that it was rather annoying to have all events displayed. For example, my kids’ school schedule is not really something they will enter in a calendar. That would make the calendar overcrowded for them. As for me, my problem was the opposite one, as I quite often have stuff in my calendar that nobody else cares about and that would just make a visual mess (e.g. dates for passport renewal are entries in my calendar). And yes, most of these issues could be sorted out by a separate calendar for the dashboard. But I didn’t really like that workflow and, most importantly, it wouldn’t show what happens the next day. And that is something my wife really wanted.

With all this I figured I would spend less time rolling my own solution than creating plugins.

Thankfully, Trmnl was kind enough to anticipate the need for a custom server in their device setup. Just select your custom destination and you’re good. Mind you, as it is an open-source project, it would be simple enough to change servers on your own, but actually having it available in their code does simplify future upgrades.

On the server side, you just need three URLs to serve.

The initial one is /api/setup. This one gets called only when the device is pointed toward the new server for the first time, and its purpose is to set up authentication keys. Because this was limited to my home network and I really didn’t care, I simply respond with 200 and a basic JSON.

{
  "status": 200,
  "api_key": "12:34:56:78:90:AB",
  "friendly_id": "1234567890AB",
  "image_url": "http://10.20.30.40:8084/hello.bmp",
  "filename": "empty_state"
}

Once the device is set up, the next API call it makes will probably be /api/log. This one I ignore because I could. :) While the device does send the request, it doesn’t care about the answer. At the time I wrote this, even Trmnl’s own API documentation didn’t cover what it does. While they later did update the documentation, I didn’t bother updating the code since 404 works here just fine and the data provided in this call is also available in the next one.

In /api/display the device actually asks you at predefined intervals what to do. The response gives two important pieces of information to the device: where the image to draw is and when it should ask for the next one. The next image is easy - I decided upon 5 minutes. You really cannot do it more often as every refresh flashes the screen. That is probably the only thing I really hate, and actually the one that will be solved eventually since there is no reason to do a full epaper reset on every draw. But, even once that is solved, you don’t want to do it more often because you will drain your battery. With a 5-minute interval it will last you a month. I could have used longer intervals but then any update I make wouldn’t be visible for a while. A month is good enough for me. To keep things simple, for the file name I just gave a specially formatted URL that my software will process later.

{
  "status": 0,
  "image_url": "http://10.20.30.40:8084/A085E37A1984_2025-06-01T04-55-00.bmp",
  "filename": "A085E37A1984_2025-06-01T04-55-00.bmp",
  "refresh_rate": 300,
  "reset_firmware": false,
  "update_firmware": false,
  "firmware_url": null,
  "special_function": "identify"
}
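
For illustration, here is a minimal Python sketch of how such a response could be put together (the actual server is written in C#, so this is not its code). The names `build_display_response`, `BASE_URL` and `REFRESH_SECONDS` are made up for this example, and how the device ID arrives in the request is left out; the field names and the file name format are copied from the JSON above.

```python
from datetime import datetime

BASE_URL = "http://10.20.30.40:8084"  # assumption: same host and port as in the JSON above
REFRESH_SECONDS = 300                 # 5 minutes between refreshes


def build_display_response(device_id, now):
    """Build the /api/display answer; device ID and time are encoded in the file name."""
    filename = f"{device_id}_{now:%Y-%m-%dT%H-%M-%S}.bmp"
    return {
        "status": 0,
        "image_url": f"{BASE_URL}/{filename}",
        "filename": filename,
        "refresh_rate": REFRESH_SECONDS,
        "reset_firmware": False,
        "update_firmware": False,
        "firmware_url": None,
        "special_function": "identify",
    }


response = build_display_response("A085E37A1984", datetime(2025, 6, 1, 4, 55, 0))
```

Because everything needed to draw the picture is in the file name, the handler that later serves the bitmap does not need to remember anything between requests.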

And finally, the last part of the API was actually providing the bitmap. Based on the file name and time requested, I would simply generate one on the fly. But you cannot just give it any old bitmap - it has to be a 1-bit bitmap (aka 1bpp). And none of the C# libraries supports it out of the box. Even SkiaSharp, which is capable of doing any manipulation you can imagine, simply refuses to deal with something that simple. After trying all reasonably popular graphic libraries only to end up with the same issue, I decided to simply go over the bits in a for loop and output my own raw bitmap bytes. Ironically, I spent less time on that than I spent on testing different libraries. In essence, the bitmap the Trmnl device wanted has a 62-byte header that is followed by a simple bit-by-bit dump of image data. You can check the Get1BPPImageBytes function if you are curious.
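
To give an idea of how little is involved, here is a rough Python sketch of a 1bpp bitmap writer. It is not the Get1BPPImageBytes function, just the standard BMP layout as I understand it: a 14-byte file header, a 40-byte info header and an 8-byte two-color palette, which adds up to the 62 bytes mentioned above. The function name `make_1bpp_bmp`, the palette order (0 = black, 1 = white) and the bottom-up row order are assumptions of this sketch, not something confirmed for the Trmnl firmware.

```python
import struct


def make_1bpp_bmp(rows):
    """rows: list of equal-length lists of 0/1 values, top row first; 1 means white."""
    height = len(rows)
    width = len(rows[0])
    stride = ((width + 31) // 32) * 4  # every row is padded to a multiple of 4 bytes

    pixel_data = bytearray()
    for row in reversed(rows):  # a positive height means rows are stored bottom-up
        line = bytearray(stride)
        for x, bit in enumerate(row):
            if bit:
                line[x // 8] |= 0x80 >> (x % 8)  # leftmost pixel is the highest bit
        pixel_data += line

    header_size = 14 + 40 + 8  # file header + info header + palette = 62 bytes
    file_header = struct.pack(
        "<2sIHHI", b"BM", header_size + len(pixel_data), 0, 0, header_size)
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height,   # header size, image dimensions
        1, 1, 0,             # planes, bits per pixel, no compression
        len(pixel_data),
        2835, 2835,          # about 72 DPI, expressed in pixels per meter
        2, 2)                # two palette colors, both important
    palette = bytes([0, 0, 0, 0, 255, 255, 255, 0])  # black, then white (BGRA order)
    return file_header + info_header + palette + bytes(pixel_data)


tiny = make_1bpp_bmp([[1, 0, 0, 0, 0, 0, 0, 0, 1]])  # 9 pixels wide, 1 pixel high
```

For an 800x480 image, each row takes 100 bytes, so the whole file comes to 62 + 48,000 bytes.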

And that is all there is to the API. Is it perfect? No. But it is easy to implement. The only pet peeve I have with it is not really the API but the device behavior in case of a missing server. Instead of just keeping the old data, it goes to an error screen. While I can see the logic, in my case, where 95% of the time nothing changes on the display, it seems counter-productive. But again, I can see why some people would prefer a fresh error screen to the old data. To each their own, I guess. The second issue I have is that there is no way to order the device NOT to update. For example, if my image is exactly the same as the previous one, why flash the screen? But, again, those are minor things.

After all this talk, one might ask - what about data? Well, all my data comes in the form of pseudo-ini files. The main one is the configuration that sets up what goes where. The full example is on GitHub; I will just show the interesting parts here.

[Events.All]
Directory=All
Top=0
Bottom=214
Left=0
Right=265

[Events.Thing1]
Directory=Thing1
Top=0
Bottom=214
Left=267
Right=532

[Events.Thing2]
Directory=Thing2
Top=0
Bottom=214
Left=534
Right=799

[Events.All+1d]
Directory=All
Offset=24
Top=265
Bottom=479
Left=0
Right=265

[Events.Thing1+1d]
Directory=Thing1
Offset=24
Top=265
Bottom=479
Left=267
Right=532

[Events.Thing2+1d]
Directory=Thing2
Offset=24
Top=265
Bottom=479
Left=534
Right=799
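
As a rough Python sketch of how such a configuration could be read: each section becomes a box with a rectangle and the date it should show. Treating Offset as a number of hours added to the current time is my reading of the `+1d` sections and is an assumption here, as are the names `load_boxes` and `SAMPLE_CONFIG`; only two of the sections above are repeated to keep the example short.

```python
import configparser
from datetime import datetime, timedelta

SAMPLE_CONFIG = """
[Events.All]
Directory=All
Top=0
Bottom=214
Left=0
Right=265

[Events.All+1d]
Directory=All
Offset=24
Top=265
Bottom=479
Left=0
Right=265
"""


def load_boxes(config_text, now):
    """Turn every section into a box: source directory, date to show, and rectangle."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    boxes = []
    for name in parser.sections():
        section = parser[name]
        # Assumption: Offset is in hours, so 24 means "tomorrow"
        offset_hours = section.getint("Offset", fallback=0)
        boxes.append({
            "name": name,
            "directory": section["Directory"],
            "date": (now + timedelta(hours=offset_hours)).date().isoformat(),
            "rect": (section.getint("Left"), section.getint("Top"),
                     section.getint("Right"), section.getint("Bottom")),
        })
    return boxes


boxes = load_boxes(SAMPLE_CONFIG, datetime(2025, 5, 26, 7, 30))
```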

Then in each of those directories, you would find something like this.

[2025-05-26]
Lunch=Šnicle
Lunch=Krumpir salata
Lunch=Riža

[2025-05-27]
Lunch=Piletina na lovački
Lunch=Pire krumpir
Lunch=Zelena salata

Each date gets its own section and all entries underneath it are displayed in order. Even better, if they have the same key, that is used as a common header. So, the “Lunch” entries above are all combined together.
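
A standard ini parser would usually complain about the repeated Lunch key, so this is where the "pseudo" part comes in. A minimal Python sketch of the grouping could look like this; `parse_events` is a made-up name and the real parser is C#, so take it only as an illustration of the idea.

```python
SAMPLE_EVENTS = """
[2025-05-26]
Lunch=Šnicle
Lunch=Krumpir salata
Lunch=Riža

[2025-05-27]
Lunch=Piletina na lovački
Lunch=Pire krumpir
Lunch=Zelena salata
"""


def parse_events(text):
    """Return {date: {header: [entries]}}, keeping the order found in the file."""
    events = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = events.setdefault(line[1:-1], {})  # a new date section
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            # entries sharing a key end up under one common header
            current.setdefault(key.strip(), []).append(value.strip())
    return events


events = parse_events(SAMPLE_EVENTS)
```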

Since files are only read when updating, I exposed them on a file share so everybody can put anything on “the wall” by simply editing a text file. This setup is definitely something that is not going to fit many people. I would almost bet that it will fit only me. However, that is the beauty of being a developer. Often you get to scratch the itch only you have.


You can find my code on GitHub. If you want to test it yourself, docker image is probably the easiest way to do so.

Getting SkiaSharp Running Under Alpine Linux

While I am not using Alpine Linux for my desktop environment, I love it in containers. And C# pairs with it like a dream. Just compile it using linux-musl-x64 runtime and you’re golden.

But, occasionally, I do have a situation where my application is running fine on Kubuntu while it just crashes on Alpine Linux. This time, crashes were coming from SkiaSharp.

Unhandled exception. System.TypeInitializationException: The type initializer for 'SkiaSharp.SKImageInfo' threw an exception.
 ---> System.DllNotFoundException: Unable to load shared library 'libSkiaSharp' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable:
Error loading shared library libfontconfig.so.1: No such file or directory (needed by /app/bin/libSkiaSharp.so)
Error loading shared library libSkiaSharp.so: No such file or directory
Error loading shared library /app/bin/liblibSkiaSharp.so: No such file or directory
Error loading shared library liblibSkiaSharp.so: No such file or directory
Error loading shared library /app/bin/libSkiaSharp: No such file or directory
Error loading shared library libSkiaSharp: No such file or directory
Error loading shared library /app/bin/liblibSkiaSharp: No such file or directory
Error loading shared library liblibSkiaSharp: No such file or directory

The first error is obvious: I was missing a fontconfig package. To install it, just do the standard APK stuff:

apk add fontconfig ttf-dejavu

And yes, I am not only installing fontconfig but also ttf-dejavu. Alpine is so lightweight that it comes without any fonts. I like DejaVu, so I decided to go with it. You can make your own font choices but don’t forget to install some if your application requires them.

But it took me a while to figure out the rest of the issues since now I faced a bit more puzzling exception:

 ---> System.DllNotFoundException: Unable to load shared library 'libSkiaSharp' or one of its dependencies. In order to help diagnose loading problems, consider using a tool like strace. If you're using glibc, consider setting the LD_DEBUG environment variable:

No matter what I did, I kept getting one set of errors or another. And the issue seemed to stem from SkiaSharp having glibc dependencies. Since Alpine Linux uses the completely different musl library, glibc is one of the rare things you cannot install.

In a moment of desperation, I was even looking to compile it from source myself since that seemed to be something people had luck with. And then, on NuGet, I noticed there is another package available: SkiaSharp.NativeAssets.Linux.NoDependencies. This package is a direct replacement for SkiaSharp.NativeAssets.Linux, the only difference being it includes its dependencies on libpthread, libdl, libm, libc, and ld-linux-x86-64. Essentially it includes all dependencies except for fontconfig that I already added to my docker image.

So, I added this dependency to my project and SkiaSharp happily worked ever after.

Making CSS URL Unique

I already wrote how I switched to 11ty. For the most part, I am still in the honeymoon phase. Essentially the only major issue (still) remaining is the missing public comment system. Whether I will solve this or not remains to be seen.

But the most annoying issue for me was not really 11ty’s fault. At least not completely. It was more the interaction between 11ty, the CloudFlare caching layer, and my caching policies.

You see, for the site’s performance, it’s quite beneficial to serve CSS files from cache whenever possible. If you can cache it, you don’t need to transfer it. That means the site is that much faster. And that is good.

However, due to multiple reasons that include preloading and long life I give to CSS files, this also means that my CSS changes don’t always propagate immediately. And thus, my CSS changes would occasionally work just fine locally, only to be invisible on public internet. At least until cache expires or all layers agree that loading the new version is in order.

My workaround was simple - every time I significantly changed a CSS file, I would also change its name, thus forcing the update. But I wanted something that can be more easily automated. So, I decided to use a query string.

My site doesn’t really use any query strings. But caching layers don’t know that. If they see a file with a new query string, they will treat it as a completely new cache entry. So, I added a step to my build process that will add a random query string to each CSS file. Something like this:

CSS_UNIQ=$(date +%s | md5sum | cut -d' ' -f1)
find ./_site -type f -name "*.html" -exec \
  sed -Ei 's|(link rel="stylesheet" href="/\S+\.css)"|\1?id='"$CSS_UNIQ"'"|g' {} +

This step comes after my 11ty site has already been built. My unique ID is just the current time. In order to make it slightly obscure, I hash it. For caching purposes, this hashing is completely unnecessary. But, to me, seeing a hash instead of an integer just looks nicer so I use it. You can use whatever you want - be it a git commit, a hash of the CSS file (not a bad idea, actually), or any other reasonably unique source. Remember, we don’t really need to have it cryptographically secure - just different from run to run.

With the ID in hand, using find, we go over each .html file and update its CSS links. This is where sed comes in - essentially any CSS file that is part of a link element will just get ?id= and the unique ID appended to it.

This code can be improved. One example already mentioned is tying ID to a hash of CSS file. Another might be just not updating ID if CSS files haven’t changed. And probably many more other optimizations that will help. But this code is a good starting point that can be adjusted to fit your site.
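
As a sketch of the first idea, here is how the ID could be derived from the CSS content instead of the clock. It is written in Python rather than shell only to keep it easy to test; the names `css_version` and `add_css_version` are made up, and the regular expression simply mirrors the sed pattern from above.

```python
import hashlib
import re

# Same shape as the sed pattern: a stylesheet link that points to a local .css file
LINK_PATTERN = re.compile(r'(link rel="stylesheet" href="/\S+?\.css)"')


def css_version(css_bytes):
    """Short ID derived from the CSS content; it changes only when the file changes."""
    return hashlib.sha256(css_bytes).hexdigest()[:12]


def add_css_version(html, version):
    """Append ?id=<version> to every stylesheet link in the given HTML."""
    return LINK_PATTERN.sub(lambda m: f'{m.group(1)}?id={version}"', html)


html = '<link rel="stylesheet" href="/style.css">'
versioned = add_css_version(html, css_version(b"body { margin: 0; }"))
```

With this approach an unchanged CSS file keeps its old URL, so caches are only invalidated when there is actually something new to load. A site with several CSS files would need one hash per file, which this sketch does not attempt.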

Lock Object

The lock statement has existed in C# from the very beginning. I still remember the first example.

lock (typeof(ClassName)) {
    // do something
}

Those who use C# will immediately yell how perilous locking on the typeof is. But hey, I am just posting Microsoft’s official advice here.

Of course, Microsoft did correct their example (albeit it took them a while) to the now common (and correct) pattern.

private object SyncRoot = new object();

lock (SyncRoot) {
    // do something
}

One curiosity of C# as a language is that you get to lock on any object. And here we just, as a convention, use the simplest object there is.

And yes, you can improve a bit on this if you use later .NET versions.

private readonly object SyncRoot = new();

lock (SyncRoot) {
    // do something
}

However, if you are using .NET 9 (C# 13) or later, you can do one better.

private readonly Lock SyncRoot = new();

lock (SyncRoot) {
    // do something
}

What’s better there? Well, for starters we now have a dedicated object type. Combine that with code analysis and now the compiler can give you a warning if you make a typo and lock onto something else by accident. And also, …, wait …, wait …, yep, that’s it. Performance in all of these cases (yes, I am excluding the typeof one) is literally the same.

As features go, this one is small and can be easily overlooked. It’s essentially just syntactic sugar. And I can never refuse something that sweet.

Modulo or Bitwise

I had an interesting thing said to me: “Did you know that modulo is much less efficient than bitwise comparison?” As someone who once painstakingly went through all E-series resistor values to find those that would make my voltage divider a power of 2, I definitely saw that in action. But that got me thinking. While an 8-bit PIC microcontroller doesn’t have a hardware divider and thus any modulo is torture, what about modern computers? How much slower do they get?

A quick search brought me a few hits and one conclusive StackOverflow answer. Searching a bit more brought me to another answer where they even did measurements. And the difference was six-fold. But I was left with a bit of a nagging feeling as both of these were 10+ years old. What is the difference you might expect on a modern CPU? And, more importantly for me, what are the differences in C#?

Well, I quickly ran some benchmarks and results are below.

Test               Parallel      Mean        StDev
(i % 4) == 0       No            202.3 us    0.24 us
(i & 0b11) == 0    No            201.9 us    0.12 us
(i % 4) == 0       CpuCount      206.4 us    7.78 us
(i & 0b11) == 0    CpuCount      196.5 us    5.63 us
(i % 4) == 0       CpuCount*2    563.9 us    7.90 us
(i & 0b11) == 0    CpuCount*2    573.9 us    6.52 us

My expectations were not only wrong but slightly confusing too.

As you can see from the table above, I did 3 tests: single-threaded, default Parallel.For, and then a Parallel.For loop with CPU overcommitment. The single-threaded test is where I saw what I expected but not in the amount expected. Bitwise was quite consistently winning but by ridiculously small margins. Unless I was doing something VERY specific, there is no chance I would care about the difference.

If we run the test in Parallel.For, the difference becomes slightly more obvious. And had I stayed just on those two, I would have said that the assumption holds for modern CPUs too.

However, once I overcommitted CPU resources, suddenly modulo was actually better. And that is something that’s hard to explain if we take the assumption that modulo just uses divide to be true.

So, I decided to take a closer peek - into the .NET CLR. And I discovered that the bitwise operation was fully omitted while the modulo operation was still there. However, the runtime then smartly decided to remove both. Thus, I was testing nothing vs. almost nothing.

Ok, after I placed a few strategic extra instructions to prevent optimization, I got the results below.

Test               Parallel      Mean          StDev
(i % 4) == 0       No              203.1 us     0.16 us
(i & 0b11) == 0    No              202.9 us     0.06 us
(i % 4) == 0       CpuCount      1,848.6 us    13.13 us
(i & 0b11) == 0    CpuCount      1,843.9 us     6.76 us
(i % 4) == 0       CpuCount*2    1,202.7 us     7.32 us
(i & 0b11) == 0    CpuCount*2    1,201.6 us     6.75 us

And yes, bitwise is indeed faster than modulo but by a really low margin. The only thing the new test “fixed” was that discrepancy in speed when you have too many threads.

Just to make extra sure that the compiler wasn’t doing “funny stuff”, I decompiled both to IL.

ldloc.1
ldc.i4.4
rem
ldc.i4.0
ceq

ldloc.1
ldc.i4.3
and
ldc.i4.0
ceq

Pretty much exactly the same, the only difference being the usage of and for the bitwise check while rem was used for modulo. In modern CPUs these two instructions seem pretty much equivalent. And when I say modern, I use that loosely since I saw the same going back a few generations.

Interestingly, just in case runtime changed those to the same code, I also checked modulo 10 just to confirm. That one was actually faster than modulo 4. That leads me to believe there are some nice optimizations happening here. But I still didn’t know if this was .NET framework or really something CPU does.

As a last resort, I went down to C and compiled it with -O0 -S. Unfortunately, even with -O0, if you use % 4, it will be converted to bitwise. Thus, I checked it against % 5.

Bitwise check compiled down to just 3 instructions (or just one if we exclude load and check).

movl	-28(%rbp), %eax
andl	$3, %eax
testl	%eax, %eax

But modulo went crazy route.

movl	-28(%rbp), %ecx
movslq	%ecx, %rax
imulq	$1717986919, %rax, %rax
shrq	$32, %rax
movl	%eax, %edx
sarl	%edx
movl	%ecx, %eax
sarl	$31, %eax
subl	%eax, %edx
movl	%edx, %eax
sall	$2, %eax
addl	%edx, %eax
subl	%eax, %ecx
movl	%ecx, %edx
testl	%edx, %edx

It converted the division into a multiplication and gets to the remainder that way. All in all, quite an impressive optimization. And yes, this occupies more memory so there are other consequences to the performance (e.g. uses more cache memory).

So, if you are really persistent with testing, the difference does exist. It’s not six-fold but it can be noticeable.
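
To convince myself of what that listing does, here is the same arithmetic replayed in Python. The constant 1717986919 is 2^33 / 5 rounded up, so multiplying by it and shifting right by 33 bits gives the quotient of a division by 5. The function name `mod5_via_multiply` is made up, and the sketch assumes the input fits into a signed 32-bit integer, just like the C variable did.

```python
import math

MAGIC = 1717986919  # ceil(2**33 / 5), the constant from the imulq instruction


def mod5_via_multiply(x):
    """Replay the listing for a signed 32-bit x and return the C-style x % 5."""
    q = ((x * MAGIC) >> 32) >> 1  # imulq, shrq $32, sarl: roughly x / 5, rounded down
    q -= x >> 31                  # sarl $31, subl: adds 1 for negative x (round toward zero)
    return x - q * 5              # sall $2 and addl build q * 5, subl leaves the remainder


# Compare against C-style remainder (sign follows the dividend), which fmod provides
samples = list(range(-1000, 1001)) + [2**31 - 1, -2**31]
all_match = all(mod5_via_multiply(x) == int(math.fmod(x, 5)) for x in samples)
```

Note that the result follows C rules, so a negative input gives a negative remainder, unlike Python's own % operator.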

At the end, do I care? Not really. Unless I am working on microcontrollers, I won’t stop using modulo where it makes sense. It makes intent much more clear and that, to me, is worth it. Even better, compilers will just take care of this for you.

So, while modulo is less efficient, stories of its slowness have been exaggerated a bit.

PS: If you want to run my tests on your system, files are available.