- Felix' Blog - A Review of Helix after 1.5 Years
- HYTRADBOI 2025
- Hot Tub Monitoring with Home Assistant and ESPHome · Jon Seager
- Initial launch: Nuanced call graph context layer for AI coding tools | Nuanced.dev
- Actual LLM agents are coming | Vintage Data
- March 20, 2025
-
🔗 mhx/dwarfs dwarfs-0.11.2 release
Bugfixes
- macOS Ventura's version of clang appears to be missing an implementation of std::hash<std::filesystem::path>, making it hard to define an unordered_map<filesystem::path>. Work around by simply using an unordered_map<string> instead.
- Installing the binaries using cmake did not honor the CMAKE_INSTALL_BINDIR or CMAKE_INSTALL_SBINDIR variables. Fixes github #253.
Full Changelog: v0.11.1...v0.11.2
SHA-256 Checksums
61fce8eaa6bbdf10917a5a12331e192748a54ab1aa175ed6f55cb26825ab3177 dwarfs-0.11.2-Linux-aarch64-clang-reldbg-stacktrace.tar.xz
06fc4ed91ee5c348dbfc70771fe3e3ea6834277e4a58f1f99e0bc98cb16ed3d4 dwarfs-0.11.2-Linux-aarch64-clang.tar.xz
15905007cff432bb9be0bdabed93473764c1706796e0da6f3af083f0a142db6d dwarfs-0.11.2-Linux-x86_64-clang-reldbg-stacktrace.tar.xz
3c82708e00af9d1622e78047efd216e4e29213a60aff3afa8326bade8353ea38 dwarfs-0.11.2-Linux-x86_64-clang.tar.xz
1b38faf399a6d01cd0e5f919b176e1cab76e4a8507088d060a91b92c174d912b dwarfs-0.11.2.tar.xz
8a028693ce0a7ab083b25dc491b100f41fbf98f28413a38f6773fe1cf27574fb dwarfs-0.11.2-Windows-AMD64.7z
600134267dd0ad51dd9d8bd1b58fa614b0a0da9a7a3d57f5fce4dbda9bb80460 dwarfs-universal-0.11.2-Linux-aarch64-clang
a9f5f79afeff4eba5cc23893de46e4c8eaa3b51b8f5938ed7f9e6cb92560fa4f dwarfs-universal-0.11.2-Linux-aarch64-clang-reldbg-stacktrace
1bee828de84c1a3a1c2134bc866f28bdf93a62927cb7e8c416813f389f7745ad dwarfs-universal-0.11.2-Linux-x86_64-clang
ddbd62d3bf0bf420a1720af6c03bea21ce6a77a73cedc46f28c8d79e4ac26827 dwarfs-universal-0.11.2-Linux-x86_64-clang-reldbg-stacktrace
d95dab93a7e9d8349d4a4393213a401d9ded79040b1c48e661df7dfe118b72a7 dwarfs-universal-0.11.2-Windows-AMD64.exe
-
🔗 sacha chua :: living an awesome life Playing with chunk size when writing rss
How long is a blog post? Some people write short posts with one clear thought. Others write longer essays.
I tend to start out writing a short post and then get distracted by all the rabbit-holes I want to go down. Drafting my thoughts on blogging leads to adding lots of blogs to my reader, writing some code that takes an OPML and makes a table of blogs and their most recent posts, fixing the org-html-themes setup for my Emacs configuration, breaking out this chunk as its own post, drawing a bunch of mindmaps, doing a braindump, tweaking my workflow for processing braindumps to use faster-whisper and whisper-ctranslate2 instead of WhisperX because of this issue, so that I can try the whisper-large-v3-turbo model, experimenting with workflows for reviewing the PDF on the iPad… Definitely lots of yak-shaving (wiktionary definition). I still want to write that post. I already have the sketch I want to include in it. It's like Chilli in the Bluey episode Sticky Gecko (script): "The door: It is right here. All we need to do is walk out of it: it's so easy!" The thought! It's right there! Just get to it, brain! But I wander because I wonder. I suppose that's all right.
It might be fun to play around with the sizes of things I share: shorter when my attention is fragmented or squirrely, longer when I can think about something over several days or years. Here are some ways to tinker with that.
Breaking thoughts down into smaller chunks so I can get them out the door:
- When I notice that something is a big blog post (like this reflection I've been working on about blogging), I can break out parts of it into their own blog posts and then replace that section with links.
- I can post interesting quotes and snippets to Mastodon and then round them up periodically or refer to them in blog posts. TODO: It might be good to have a shortcut for an accessible link to a toot using a speech bubble or similar icon.
Taming my tangents and ideas: I'm sometimes envious of blogs with neat side notes, but really, I should just accept that the tangents that my mind wants to go on can take paragraphs and are more suited to, say, collapsible details or a different blog post. Something I can experiment with: instead of gallivanting off on that tangent (soo hard to resist when there's an idea for an Emacs tweak!), I can add a TODO and leave it for my future self. Maybe even two TODOs: one inline, where it makes sense in the text; and one in my Org Mode, with a link to the blog post so that I can go back and update it when (if!) I get around to it. Who knows, maybe someone might comment with something that already exists.
Saving scraps: It's easier to cut out half-fleshed-out ideas if I tell myself I'm just saving them somewhere. Right now I capture/refile them to a scraps heading, but there's probably a better way to handle this. Maybe I can post some thoughts to Mastodon and then save the toot URL. Maybe I can experiment with using Denote to manage private notes.
Connecting thoughts and building them up:
- I tend to write in small chunks. (TODO: I could probably do some kind of word-count analysis, might be neat.) Sketchnotes and hyperlinks might help me chunk thoughts so I can think about bigger things. I can link to paragraphs and text fragments, so I can connect thoughts with other parts of thoughts instead of trying to get the granularity right the first time around. The shortcuts I made for linking to blog posts and searching the Web or my notes are starting to help.
- I sporadically work on topic maps or indices. Maybe I'll gradually flesh them out into a digital garden / personal wiki.
- Sometimes I don't remember the exact words I used. Probabilistic search or vector search might help here, too. I don't need an AI-generated summary, I just want a list of related posts.
- I can figure out how to add backlinks to my blog, or simplify the workflow for adding links to previous posts. Maybe something based on this guide for 11ty or binyamin/eleventy-plugin-backlinks. I might need to write something custom anyway so that I can ignore the links coming from monthly/weekly review posts.
Connecting to other people's thoughts: For the purposes of conversation, it'll probably be good to let people know if I write something about their blog post. Doesn't happen automatically. Pingbacks and referrer logs got too swamped by spam a long time ago, so I don't think anyone really uses them. Idea: It might be neat to have something that quickly lists all the external links in a post, and maybe a way to save the e-mail addresses or Mastodon handles for people after I look them up so that I can make that even smoother, and some kind of quick template. I can send email and toot from within Emacs, so that's totally doable… (No, I am not going to write it right now, I'm going to add it to my to-do list.)
(Also, there's another thought here about books and The Great Conversation, and blogs and smaller-scale conversations, and William Thurston and mathematicians and understanding, and cafes…)
Hmm. I think that getting my brain to make smaller chunks and get them out the door will be a good thing to focus on. Synthesizing can come later.
-
🔗 Console.dev newsletter Konva rss
Description: JS 2D Canvas.
What we like: Framework for building animations and graphics on a 2D HTML5 canvas. Works well with React, Vue, Svelte. Dynamic animations, tweens, pre-built filters, & node management built in. Canvas can be exported in high quality to data URLs or images.
What we dislike: React integration doesn’t support React Native, but Konva itself works across platforms.
-
🔗 Console.dev newsletter Goravel rss
Description: Go web application framework.
What we like: Designed to be consistent with Laravel to make migration easy. Lots of built in modules e.g. AuthN and AuthZ, routing, middleware, gRPC, session, queues, validation, logging, etc. Includes an ORM natively integrated with seeding and migrations. Plugins for extending functionality.
What we dislike: There were breaking changes between minor version releases (v1.13 to v1.14 and v1.14 to v1.15).
-
- March 19, 2025
-
🔗 Jeremy Fielding (YouTube) A Critical Piece of Machinery has Failed. rss
If you want to join my community of makers and Tinkers consider getting a YouTube membership 👉 https://www.youtube.com/@JeremyFieldingSr/join
If you want to chip in a few bucks to support these projects and teaching videos, please visit my Patreon page or Buy Me a Coffee. 👉 https://www.patreon.com/jeremyfieldingsr 👉 https://www.buymeacoffee.com/jeremyfielding
Social media, websites, and other channel
Instagram https://www.instagram.com/jeremy_fielding/?hl=en Twitter 👉https://twitter.com/jeremy_fielding TikTok 👉https://www.tiktok.com/@jeremy_fielding0 LinkedIn 👉https://www.linkedin.com/in/jeremy-fielding-749b55250/ My websites 👉 https://www.jeremyfielding.com 👉https://www.fatherhoodengineered.com My other channel Fatherhood engineered channel 👉 https://www.youtube.com/channel/UC_jX1r7deAcCJ_fTtM9x8ZA
Notes:
Technical corrections
Nothing yet
-
🔗 sacha chua :: living an awesome life Reading more blogs; Emacs Lisp: Listing blogs based on an OPML file rss
Nudged by Dave Winer's post about old-school bloggers and my now-nicely-synchronizing setup of NetNewsWire (iOS) and FreshRSS (web), I gave Claude AI this prompt to list bloggers (with the addition of "Please include URLs and short bios.") and had fun going through the list it produced. A number of people were no longer blogging (unreachable sites or inactive blogs), but I found a few that I wanted to add to my feed reader.
Here is my people.opml at the moment (slightly redacted, as I read my husband's blog as well). This list has some non-old-school bloggers as well and some sketchnoters, but that's fine. It's a very tiny slice of the awesomeness of the Internet out there, definitely not exhaustive, just a start. I've been adding more by trawling through indieblog.page and the occasional interesting post on news.ycombinator.com.
It makes sense to make an HTML version to make it easier for people to explore, like those old-fashioned blog rolls. Ooh, maybe some kind of table like indieblog.page, listing a recent item from each blog. (I am totally not surprised about my tendency to self-nerd-snipe with some kind of Emacs thing.) This uses my-opml-table and my-rss-get-entries, which I have just added to my Emacs configuration.
my-opml-table
    (defun my-opml-table (xml)
      (sort
       (mapcar
        (lambda (o)
          (let ((latest (car (condition-case nil
                                 (my-rss-get-entries (dom-attr o 'xmlUrl))
                               (error nil)))))
            (list
             (if latest
                 (format-time-string "%Y-%m-%d" (plist-get latest :date))
               "")
             (org-link-make-string
              (or (dom-attr o 'htmlUrl) (dom-attr o 'xmlUrl))
              (replace-regexp-in-string " *|" "" (dom-attr o 'text)))
             (if latest
                 (org-link-make-string
                  (plist-get latest :url)
                  (or (plist-get latest :title) "(untitled)"))
               ""))))
        (dom-search
         xml
         (lambda (o)
           (and
            (eq (dom-tag o) 'outline)
            (dom-attr o 'xmlUrl)
            (dom-attr o 'text)))))
       :key #'car
       :reverse t))
my-rss-get-entries: Return a list of the form ((:title … :url … :date …) …).
    (defun my-rss-get-entries (url)
      "Return a list of the form ((:title ... :url ... :date ...) ...)."
      (with-current-buffer (url-retrieve-synchronously url)
        (set-buffer-multibyte t)
        (goto-char (point-min))
        (when (re-search-forward "<\\?xml\\|<rss" nil t)
          (goto-char (match-beginning 0))
          (sort
           (let* ((feed (xml-parse-region (point) (point-max)))
                  (is-rss (> (length (xml-get-children (car feed) 'entry)) 0)))
             (if is-rss
                 (mapcar
                  (lambda (entry)
                    (list
                     :url
                     (or (xml-get-attribute
                          (car
                           (or
                            (seq-filter
                             (lambda (x)
                               (string= (xml-get-attribute x 'rel) "alternate"))
                             (xml-get-children entry 'link))
                            (xml-get-children entry 'link)))
                          'href)
                         (dom-text (dom-by-tag entry 'guid)))
                     :title
                     (elt (car (xml-get-children entry 'title)) 2)
                     :date
                     (date-to-time (elt (car (xml-get-children entry 'updated)) 2))))
                  (xml-get-children (car feed) 'entry))
               (mapcar
                (lambda (entry)
                  (list
                   :url
                   (or (caddr (car (xml-get-children entry 'link)))
                       (dom-text (dom-by-tag entry 'guid)))
                   :title
                   (caddr (car (xml-get-children entry 'title)))
                   :date
                   (date-to-time (elt (car (xml-get-children entry 'pubDate)) 2))))
                (xml-get-children (car (xml-get-children (car feed) 'channel)) 'item))))
           :key (lambda (o) (plist-get o :date))
           :lessp #'time-less-p
           :reverse t))))
(my-opml-table (xml-parse-file "~/Downloads/people.opml"))
I'm rebuilding my feed list from scratch. I want to read more. I read the aggregated feeds at planet.emacslife.com every week as part of preparing Emacs News. Maybe I'll go over the list of blogs I aggregate there, widen it to include all posts instead of just Emacs-specific ones, and see what resonates. Emacs people tend to be interesting. Here is an incomplete list based on people who've posted in the past two years or so, based on this work-in-progress planetemacslife-expanded.opml. (I haven't tweaked all the URLs yet. I stopped at around 2023 and made the rest of the elements xoutline instead of outline so that my code would skip them.)
    (my-opml-table (xml-parse-file "~/Downloads/planetemacslife-expanded.opml"))
Making this table was fun. It's nice to see a lot of people also writing and learning out loud. This reminded me a little of EmacsConf - 2020 - talks - Sharing blogs (and more) with org-webring. TODO: Could be fun to have a blogroll page again.
I notice I tend to like:
- posts about adapting technology to personal interests, more than posts about the industry or generalizations
- detailed posts about things I'm currently interested in (Emacs, personal knowledge management, some Javascript), more than detailed tech posts about things I've decided not to get into at the moment
- "I" posts more than "You" posts: personal reflections rather than didactic advice
- curiosity, fun, experimentation
Looking forward to discovering more!
-
🔗 mhx/dwarfs dwarfs-0.11.1 release
Bugfixes
- macOS Ventura's version of clang appears to be missing the <source_location> header, despite Apple claiming otherwise. Fix this by shipping a wrapper and providing a fallback implementation.
Full Changelog: v0.11.0...v0.11.1
SHA-256 Checksums
3e1b6331cf2f589d7058700aa2c5dc41f1825f3954f3828eb709034ba57a7c97 dwarfs-0.11.1-Linux-aarch64-clang-reldbg-stacktrace.tar.xz
23b1e0b18a7c3ffeb6c5fcc97ab032a7c1c651454d0fa5cb9741918d97a14ab3 dwarfs-0.11.1-Linux-aarch64-clang.tar.xz
4ec6614e87064ac96dfdb9b7957620bd889f5f1e95416409369d00606cfff0a1 dwarfs-0.11.1-Linux-x86_64-clang-reldbg-stacktrace.tar.xz
1eebf6e66eb5d6dc7cfb9c9b3c7c6e67084acc5ced7a018c15a511e929598f99 dwarfs-0.11.1-Linux-x86_64-clang.tar.xz
7a0cccb1ec3c2a18e9a014893c1d3e1f8f2c44ade6936c9f6d3bab5ec14b2052 dwarfs-0.11.1.tar.xz
c43fd9f2089b94ddd2819c7853dd6d8c34951e2d42cbfdc2e4470cde9c3e18fb dwarfs-0.11.1-Windows-AMD64.7z
e9bf1f8bcccf363be25396d2f60d9f4e7765eba5bd647f071aa4d0ba5cb3785b dwarfs-universal-0.11.1-Linux-aarch64-clang
22966f1dba98697db0cad127d1d8c50ef5952b5c9816cc2564b9410a37cdbaa3 dwarfs-universal-0.11.1-Linux-aarch64-clang-reldbg-stacktrace
0a025dc0f854ad9f3a5f9ca89ac43ada5305de14fe4b2e03088ce9ee5a23dbf4 dwarfs-universal-0.11.1-Linux-x86_64-clang
a38de846f48c7979c204699af6a1fda0d8d634caac68223b2b46fcb1c52c0e56 dwarfs-universal-0.11.1-Linux-x86_64-clang-reldbg-stacktrace
b8299e5f2102283c52c1cbd8ad14a0c3b71244937327e895e0aee809d92e4474 dwarfs-universal-0.11.1-Windows-AMD64.exe
-
🔗 streamyfin/streamyfin v0.27.0 release
This release includes stability improvements and new features.
The upgrade to VLC4 introduced performance issues and bugs. As a result, we're reverting to VLC3, which unfortunately means PiP support will no longer be available for iOS users for now.
New features include:
- Sessions view for admins
- Notification support for all kinds of events using the Streamyfin plugin for Jellyfin
- Downgrade to VLC3 for stability - sadly removes PiP for now.
- Mark/unmark your favorite media directly from listings as a quick action
What's Changed
- feat: add japanese translations by @tkymmm in #552
- feat: Sessions view by @lostb1t in #537
- feat: Add Chinese (Simplified) Translation by @vuhe in #556
- fix: Improve Chinese (Traditional) Translation by @Marcio2536 in #557
- feat: focus search bar on second tab press by @fredrikburmester in #558
- fix(511): fixed long named translations for the subtitle can break UI by @sbaiahmed1 in #564
- feat: Mark/unmark favorite quick action by @lostb1t in #561
- fix: makes the icon adaptive for android by @Simon-Eklundh in #569
- fix: Update nl.json by @Little709 in #565
- fix(#566): add Turkish by @sbaiahmed1 in #583
- feat: Add session count to app badge by @lostb1t in #575
- fix: update textContentType for username input to oneTimeCode by @sbaiahmed1 in #587
- feat: Added Ukrainian translation by @ozgreat in #593
- feat: enhance favorites with empty cell && added translations by @sbaiahmed1 in #594
- fix: fixed app crash on next downloaded item && update biome schema v… by @sbaiahmed1 in #610
- feat: add Polish translation and update language options by @sbaiahmed1 in #608
New Contributors
- @tkymmm made their first contribution in #552
- @vuhe made their first contribution in #556
- @Little709 made their first contribution in #565
- @ozgreat made their first contribution in #593
Full Changelog: v0.26.1...v0.27.0
-
🔗 matklad Comptime Zig ORM rss
Comptime Zig ORM Mar 19, 2025
This post can be considered an advanced Zig tutorial. I will be covering some of the more unique aspects of the language, but won’t be explaining the easy part. If you haven’t read the Zig Language Reference, you might start there. Additionally, we will also learn the foundational trick for implementing a relational model.
You will learn a sizable chunk of Zig after this post, but this isn’t going to be an easy read, so prepare your favorite beverage and get comfy!
[On Learning](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#On-Learning)
One of the most ridiculously effective ways of learning for me is building toy versions of programs. This is slightly more specific than “to learn to code, code”: I claim that you can learn more by spending a week building your own very bad version of an application from scratch than you’d learn from working full-time for a year on a production ready codebase. Case in point: although the code in this post is lifted from TigerBeetle, and I’ve been working with it for a couple of years, I’ve learned a bunch of new things myself in the evening of hacking on the code for the post.
The hard part about the toy problem approach is finding the right toy! I remember, early in my career, spending about a year pestering everyone with the “what is your favorite model problem?” question, and not getting a real answer. Until one day @zmacter asked “have you tried a raytracer?” and that became my model problem for learning programming languages. Seriously, if you want to learn Zig, go write yourself a raytracer, I have some notes for that here.
[The Database](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Database)
In this post, we’ll work on a solution of a different model problem, which I think is an especially good fit for showcasing Zig’s comptime capabilities. This problem is a simplified version of the LSM Forest code from TigerBeetle.
Specifically, we will be implementing an in-memory relational database, whose schema is set at compile time. Before diving into the implementation, let’s sketch the interface we want.
First, we define objects which we will be storing in our database, accounts and transfers:
    const Account = struct {
        id: ID = @enumFromInt(0),
        balance: u128,

        pub const ID = enum(u64) { _ };
    };

    const Transfer = struct {
        id: ID = @enumFromInt(0),
        amount: u128,
        debit_account: Account.ID,
        credit_account: Account.ID,

        pub const ID = enum(u64) { _ };
    };
In Zig, struct is an expression that yields an anonymous struct type, which needs to be explicitly bound to an identifier:
    const Account = struct { ... }
Structs contain fields and declarations. Fields can have default values. This curious pattern
    pub const ID = enum(u64) { _ };
is a Zig idiom for creating a newtype over an integer. ID is an enumeration, whose backing type is u64. This enumeration doesn’t have any explicitly named variants, but it is open (_): any u64 numeric value is considered to be a member. This is exactly what we want for an id: it’s an opaque number with a unique type, whose “numberness” is not exposed (you can’t add two ids together). In the transfer struct, we refer to account id:
    debit_account: Account.ID
Note that although Account.ID and Transfer.ID have exactly the same definition, they are distinct types. Let this sink in: Zig’s type system is nominal, but all types are anonymous!
Ids will be assigned by our database automatically, using an auto-incrementing counter, and we will use zero id to signify a new object without an id assigned yet:
    id: ID = @enumFromInt(0),
@enumFromInt and @intFromEnum are built-in functions for casting between an enum and its backing integer. It could have been cleaner to instead write:
    const Account = struct {
        id: ID = .unassigned,
        balance: u128,

        pub const ID = enum(u64) {
            unassigned = 0,
            _,
        };
    };
That is, to add an explicitly named variant for zero.
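The newtype idiom can be tried in isolation with a small sketch. AccountID, TransferID and next_id are hypothetical names introduced only for this illustration (they stand in for Account.ID and Transfer.ID), and the code assumes a recent Zig compiler:

```zig
const std = @import("std");

// Hypothetical stand-ins for Account.ID and Transfer.ID from the post.
// Both have the same definition, yet they are two different types.
const AccountID = enum(u64) { unassigned = 0, _ };
const TransferID = enum(u64) { unassigned = 0, _ };

/// Illustrative helper: the id following `id`, computed on the backing integer.
/// The return type supplies the result type for @enumFromInt.
fn next_id(id: AccountID) AccountID {
    return @enumFromInt(@intFromEnum(id) + 1);
}

pub fn main() void {
    const fresh: AccountID = .unassigned;
    const first = next_id(fresh);

    std.debug.print("first = {d}\n", .{@intFromEnum(first)});
    std.debug.print("distinct types: {}\n", .{AccountID != TransferID});

    // Uncommenting the next line should fail to compile, because an
    // AccountID does not coerce to a TransferID:
    // const bad: TransferID = first;
}
```

Arithmetic only happens after an explicit @intFromEnum, which is the point of the idiom: the "numberness" of an id has to be asked for.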
A word on built in functions. In Zig, all compiler builtins use special syntax with @. This is somewhat unusual: traditionally, builtins are some magic functions inside the standard library, specially marked. Zig boldly uses a dedicated syntax for builtins, but, in exchange, the Zig standard library is not privileged at all. It is just a normal library which happens to be distributed with the compiler; it doesn’t have special powers.
A word on type inference. Zig hits a sweet spot:
- explicit types are rarely needed in expressions,
- where they are needed a human reader usually wouldn’t be able to tell the type on the spot anyway,
- function signatures are always explicit,
- and the inference algorithm is direct and simple.
Specifically, Zig doesn’t do Hindley-Milner (separate phases of constraint gathering and solving), and instead directly infers a type of expression from the types of its subexpressions, with a small twist. If the type of the result is known already through other means, it is propagated down. Certain syntaxes and builtins take advantage of this known result type. Consider both flavors of defaulting id to zero:
    id: ID = .unassigned,
    id: ID = @enumFromInt(0),
Here, the result type is known, it is ID. So, when Zig evaluates .unassigned, it knows that the result type must be ID, and “desugars” the shorthand to ID.unassigned (the type needn’t be namable). Similarly, for the @enumFromInt case, the type of the enum to convert to is taken from the result. If the type is not otherwise known from the context, the result is a compilation error which can be fixed with an @as type ascription builtin:
    // Compilation error: _which_ enum?
    const mystery = @enumFromInt(0);

    const mystery = @as(ID, @enumFromInt(0));
Note how Zig doesn’t need type ascription syntax, and just uses a builtin function.
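Result types also flow in from function signatures, not only from field declarations. The sketch below is an illustration under assumptions: ID mirrors the named-variant enum discussed above, and describe is a hypothetical helper that is not part of the post's code.

```zig
const std = @import("std");

// Hypothetical id type, mirroring the named-variant version of ID.
const ID = enum(u64) { unassigned = 0, _ };

/// Illustrative helper, not part of the post's code.
fn describe(id: ID) []const u8 {
    // The declared return type lets both string literals coerce to a slice.
    return if (id == .unassigned) "new" else "stored";
}

pub fn main() void {
    // The parameter type provides the result type for both shorthand forms.
    std.debug.print("{s}\n", .{describe(.unassigned)});
    std.debug.print("{s}\n", .{describe(@enumFromInt(7))});

    // Inside an untyped tuple there is no result type, so @as supplies one.
    std.debug.print("{d}\n", .{@intFromEnum(@as(ID, @enumFromInt(7)))});
}
```

In every case the type comes from the immediate context (a parameter, a return type, or an explicit @as), which matches the direct, single-pass inference described above.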
So, yeah, we have accounts and transfers, they both have ids assigned by the database, an account has a balance, a transfer has an amount and refers to two accounts:
    const Account = struct {
        id: ID = @enumFromInt(0),
        balance: u128,

        pub const ID = enum(u64) { _ };
    };

    const Transfer = struct {
        id: ID = @enumFromInt(0),
        amount: u128,
        debit_account: Account.ID,
        credit_account: Account.ID,

        pub const ID = enum(u64) { _ };
    };
Now, let’s define our database from the schema:
    const DBType = @import("./db.zig").DBType;

    const DB = DBType(.{
        .tables = .{
            .account = Account,
            .transfer = Transfer,
        },
        .indexes = .{
            .transfer = .{
                .debit_account,
                .credit_account,
            },
        },
    });
A lot is going on here. First, we use the @import builtin function to import (look, no need for syntax again!) the DBType function. DBType is a type constructor: it takes a DB schema, and returns a database type. For the schema, we ask for two tables, accounts and transfers, and also ask to include indexes on transfers’ foreign keys.
The implementation of DBType is the meat of this post, but, for now, let’s see how we use it. Let’s write a function to add a transfer to the database:
    fn create_transfer(
        db: *DB,
        gpa: std.mem.Allocator,
        debit_account: Account.ID,
        credit_account: Account.ID,
        amount: u128,
    ) !?Transfer.ID {
        ...
    }
Zig doesn’t have a global allocator, so anything that needs to allocate takes an allocator argument. std.mem.Allocator is dynamically dispatched: inside it are a type-erased pointer to a particular allocator’s state, and a pointer to a vtable:
    const Allocator = struct {
        ptr: *anyopaque,
        vtable: *const VTable,

        pub const VTable = struct {
            alloc: *const fn (
                *anyopaque,
                len: usize,
                alignment: Alignment,
                ret_addr: usize,
            ) ?[*]u8,
            ...
        };
    };
This is a trait object, coded manually.
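The same hand-rolled pattern works for any interface. Here is a minimal sketch with made-up names: Counter plays the role of the interface and ByTwo the role of a concrete implementation. Neither exists in the standard library; only the ptr plus vtable layout is borrowed from std.mem.Allocator.

```zig
const std = @import("std");

/// Hypothetical interface: a type-erased state pointer plus a vtable pointer.
const Counter = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        next: *const fn (ptr: *anyopaque) u32,
    };

    /// Dispatches dynamically through the vtable.
    fn next(self: Counter) u32 {
        return self.vtable.next(self.ptr);
    }
};

/// Hypothetical concrete implementation that counts in steps of two.
const ByTwo = struct {
    value: u32 = 0,

    fn next(ptr: *anyopaque) u32 {
        // Recover the concrete type from the type-erased pointer.
        const self: *ByTwo = @ptrCast(@alignCast(ptr));
        self.value += 2;
        return self.value;
    }

    /// Packs a pointer to the state and a pointer to a static vtable.
    fn counter(self: *ByTwo) Counter {
        return .{ .ptr = self, .vtable = &.{ .next = next } };
    }
};

pub fn main() void {
    var by_two: ByTwo = .{};
    const counter = by_two.counter();
    _ = counter.next();
    std.debug.print("{d}\n", .{counter.next()}); // prints 4
}
```

The concrete state is declared var because it is mutated through the pointer, while the interface value itself is just two pointers and can be const.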
gpa stands for general purpose allocator, which behaves more or less like a global allocator would, as far as the code is concerned. You often see arena: Allocator, signifying that memory doesn’t need to be freed on a per-object basis, or scratch: Allocator, signifying that the memory can be used for short-lived allocations inside the function, but can’t outlive it.
Inserting a new object into our in-memory database could allocate, so we need an allocator argument, and, conversely, need to signal the possibility of an allocation failure in our result type, which is what the bang (!) is for.
Another reason for why the operation might fail is that the transfer itself might be invalid (e.g., insufficient balance). For simplicity, I choose to model this by returning a null instead of Transfer.ID, hence the question mark (?).
In Zig, types are always specified in prefix notation, without exception. For example, [3]?struct { r: u8, g: u8, b: u8 } is an array of three optional colors.
Let’s see the implementation of create_transfer:
    if (debit_account == credit_account) return null;

    const dr = db.account.get(debit_account) orelse return null;
    const cr = db.account.get(credit_account) orelse return null;

    if (dr.balance < amount) return null;
    if (cr.balance > std.math.maxInt(u128) - amount) return null;

    db.account.update(.{
        .id = debit_account,
        .balance = dr.balance - amount,
    });
    db.account.update(.{
        .id = credit_account,
        .balance = cr.balance + amount,
    });

    return try db.transfer.create(gpa, .{
        .debit_account = debit_account,
        .credit_account = credit_account,
        .amount = amount,
    });
Zig doesn’t require braces around if’s body which makes for concise “guard” ifs. It comes with autoformatter out of the box, so there’s very little possibility for indentation confusion.
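The combination of an error union and an optional in one return type, together with brace-less guard ifs, can be seen in a smaller sketch. parse_amount and ParseError are hypothetical names used only for illustration: an error means the input was malformed, null means there was nothing to parse, and a value means success.

```zig
const std = @import("std");

// Hypothetical error set for this illustration.
const ParseError = error{NotANumber};

/// Error for malformed input, null for empty input, a value otherwise.
fn parse_amount(text: []const u8) ParseError!?u128 {
    if (text.len == 0) return null; // guard if, no braces needed
    return std.fmt.parseInt(u128, text, 10) catch return error.NotANumber;
}

pub fn main() !void {
    // `try` strips the error layer, `orelse` strips the optional layer.
    const parsed: ?u128 = try parse_amount("100");
    const amount = parsed orelse 0;
    const empty = try parse_amount("");
    std.debug.print("amount = {d}, empty is null: {}\n", .{ amount, empty == null });

    // Errors can also be handled in place instead of being propagated.
    if (parse_amount("abc")) |_| {
        std.debug.print("unexpectedly parsed\n", .{});
    } else |err| {
        std.debug.print("error: {s}\n", .{@errorName(err)});
    }
}
```

Reading the type left to right gives the order in which the layers are peeled off: first the error (!), then the optional (?), then the payload.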
db.account.get(credit_account) looks up an account by an id. The account might or might not exist, so the return type of this function is ?Account. Zig’s orelse unpacks optionals. The type of a return expression is noreturn (! from Rust), so the type of cr and dr is just Account, without a question mark. Instead of returning, we could have orelsed some default Account.
This line is the only place where we need to help type inference by spelling a type explicitly:
    if (cr.balance > std.math.maxInt(u128) - amount) return null;
We pass the u128 type to the maxInt function. This is the case where a sophisticated smart type inference algorithm could look at the surrounding context and infer the type, but Zig deliberately requires the user to spell it in situations like this.
Having done the balance checks, we ask our database to update the two balances, and to persist the new transfer object. Only the transfer.create call gets an allocator, so it is immediately obvious that only this part of the function can allocate.
[A Usage Example](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#A-Usage-Example)
Now, let’s write some test program:
pub fn main() !void { ... }
We are going to allocate, so we might fail, so we return !void. So, we’ll need an allocator:
    var gpa_instance: std.heap.DebugAllocator(.{}) = .{};
    const gpa = gpa_instance.allocator();
gpa_instance is a concrete allocator, of type std.heap.DebugAllocator(.{}). It is initialized with default values for all the fields. When evaluating = .{}, Zig knows the result type, so it knows which fields are defaulted. gpa is our trait object. Internally, it contains a pointer to gpa_instance. The state of gpa_instance will be mutated, so it needs to be declared as a var. The gpa though is just a pair of pointers, and those pointers won’t be mutated themselves, so we can declare it const, similarly to how in Rust you’d write
    let mut x = 92;
    let r = &mut x; // no mut on r!
Because Zig doesn’t track aliasing in the type system, figuring out what can get mutated when is generally harder in Zig than in Rust.
For the usage example, we’ll need some random numbers, which follow a similar pattern — a concrete PRNG and a dynamically dispatched trait object / vtable:
    var random_instance = std.Random.DefaultPrng.init(92);
    const random = random_instance.random();
Finally, we create an instance of our database:
    var db: DB = .{};
    // defer db.deinit(gpa);
DB will allocate memory, so we absolutely do need a deinit function to free it, but I am excluding it from the tutorial, as it requires some not particularly illuminating legwork.
For starters, just create two accounts and a transfer:
    const alice: Account.ID = try db.account.create(gpa, .{ .balance = 100 });
    const bob: Account.ID = try db.account.create(gpa, .{ .balance = 200 });

    const transfer: ?Transfer.ID = try create_transfer(&db, gpa, alice, bob, 100);
    assert(transfer != null);
So far, this feels like a hash-map with more steps. We aren’t doing anything relational here. We will, soon, but we’ll need some fake data. To keep things a touch more realistic, we won’t be distributing transfers equally between accounts, and instead ensure that 20% of hottest accounts are responsible for 80% of transfers:
    fn pareto_index(random: std.Random, count: usize) usize {
        assert(count > 0);

        const hot = @divFloor(count * 2, 10);
        if (hot == 0) return random.uintLessThan(usize, count);

        if (random.uintLessThan(u32, 10) < 8) {
            return pareto_index(random, hot);
        }
        return hot + random.uintLessThan(usize, count - hot);
    }
Nothing new here: @divFloor is another builtin (an intention-bearing name for /), and we need to pass the type we want to get out of random explicitly, instead of having type inference (and the human reader) figuring it out.
The loop to populate the database is slightly more interesting:
    var accounts: std.ArrayListUnmanaged(Account.ID) = .empty;
    defer accounts.deinit(gpa);

    const account_count = 100;
    try accounts.ensureTotalCapacity(gpa, account_count);

    accounts.appendAssumeCapacity(alice);
    accounts.appendAssumeCapacity(bob);
    while (accounts.items.len < account_count) {
        const account = try db.account.create(gpa, .{ .balance = 1000 });
        accounts.appendAssumeCapacity(account);
    }

    const transfer_count = 100;
    for (0..transfer_count) |_| {
        const debit = pareto_index(random, account_count);
        const credit = pareto_index(random, account_count);
        const amount = random.uintLessThan(u128, 10);
        _ = try create_transfer(
            &db,
            gpa,
            accounts.items[debit],
            accounts.items[credit],
            amount,
        );
    }
We’ll need to store generated account ids somewhere, so we use an ArrayList. Zig strongly pushes you towards batching your allocations, so we preallocate space for a hundred accounts at once, and then append without passing a gpa in. For simplicity, we don’t implement a reservation API for our database, so we do need a gpa when creating an account or a transfer.
In Zig, an unused return value is a compilation error, so we need _ = to ignore the result of the transfer.
Finally, we come to the relational part of the tutorial, we’ll do a non-trivial lookup. First, we’ll ask for all transfers from alice to anyone, and then for transfers between alice and bob specifically:
    var transfers_buffer: [10]Transfer = undefined;

    const alice_transfers = db.transfer.filter(
        .{ .debit_account = alice },
        &transfers_buffer,
    );
    for (alice_transfers) |t| {
        std.debug.print("alice: from={} to={} amount={}\n", .{
            t.debit_account,
            t.credit_account,
            t.amount,
        });
    }

    const alice_to_bob_transfers = db.transfer.filter(
        .{ .debit_account = alice, .credit_account = bob },
        &transfers_buffer,
    );
    for (alice_to_bob_transfers) |t| {
        std.debug.print("alice to bob: from={} to={} amount={}\n", .{
            t.debit_account,
            t.credit_account,
            t.amount,
        });
    }
The interesting parts are highlighted. We can ask the database to filter transfer objects to only those with matching attributes. You can do it using a brute-force loop over all transfers. But, if you are serious about your relational model, you obviously want to be faster than that! I wonder if this has something to do with the indexes we added when declaring `DB`?

For Zig specifics, I don’t want to allocate the result, and I don’t want to bother with iterators, so I pass a stack-allocated out buffer in.
[A Call To Action](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#A-Call-To-Action)
Here’s what we got so far, the interface for the so-far mysterious `db.zig`:

const std = @import("std"); const assert = std.debug.assert; const DBType = @import("./db.zig").DBType; const Account = struct { id: ID = @enumFromInt(0), balance: u128, pub const ID = enum(u64) { unassigned, _, }; }; const Transfer = struct { id: ID = @enumFromInt(0), amount: u128, debit_account: Account.ID, credit_account: Account.ID, pub const ID = enum(u64) { _ }; }; const DB = DBType(.{ .tables = .{ .account = Account, .transfer = Transfer, }, .indexes = .{ .transfer = .{ .debit_account, .credit_account, }, }, }); fn create_transfer( db: *DB, gpa: std.mem.Allocator, debit_account: Account.ID, credit_account: Account.ID, amount: u128, ) !?Transfer.ID { if (debit_account == credit_account) return null; const dr = db.account.get(debit_account) orelse return null; const cr = db.account.get(credit_account) orelse return null; if (dr.balance < amount) return null; if (cr.balance > std.math.maxInt(u128) - amount) return null; db.account.update(.{ .id = debit_account, .balance = dr.balance - amount, }); db.account.update(.{ .id = credit_account, .balance = cr.balance + amount, }); return try db.transfer.create(gpa, .{ .debit_account = debit_account, .credit_account = credit_account, .amount = amount, }); } pub fn main() !void { var gpa_instance: std.heap.DebugAllocator(.{}) = .{}; const gpa = gpa_instance.allocator(); var random_instance = std.Random.DefaultPrng.init(92); const random = random_instance.random(); var db: DB = .{}; // defer db.deinit(gpa); const alice: Account.ID = try db.account.create(gpa, .{ .balance = 100 }); const bob: Account.ID = try db.account.create(gpa, .{ .balance = 200 }); const transfer = try create_transfer(&db, gpa, alice, bob, 100); assert(transfer != null); var accounts: std.ArrayListUnmanaged(Account.ID) = .empty; defer accounts.deinit(gpa); const account_count = 100; try accounts.ensureTotalCapacity(gpa, account_count); accounts.appendAssumeCapacity(alice); accounts.appendAssumeCapacity(bob); while (accounts.items.len < account_count) { const account = try db.account.create(gpa, .{ .balance = 1000 }); accounts.appendAssumeCapacity(account); } const transfer_count = 100; for (0..transfer_count) |_| { const debit = pareto_index(random, account_count); const credit = pareto_index(random, account_count); const amount = random.uintLessThan(u128, 10); _ = try create_transfer( &db, gpa, accounts.items[debit], accounts.items[credit], amount, ); } var transfers_buffer: [10]Transfer = undefined; const alice_transfers = db.transfer.filter( .{ .debit_account = alice }, &transfers_buffer, ); for (alice_transfers) |t| { std.debug.print("alice: from={} to={} amount={}\n", .{ t.debit_account, t.credit_account, t.amount, }); } std.debug.print("\n\n", .{}); const alice_to_bob_transfers = db.transfer.filter( .{ .debit_account = alice, .credit_account = bob }, &transfers_buffer, ); for (alice_to_bob_transfers) |t| { std.debug.print("alice to bob: from={} to={} amount={}\n", .{ t.debit_account, t.credit_account, t.amount, }); } } fn pareto_index(random: std.Random, count: usize) usize { assert(count > 0); const hot = @divFloor(count * 2, 10); if (hot == 0) return random.uintLessThan(usize, count); if (random.uintLessThan(u32, 10) < 8) return pareto_index(random, hot); return hot + random.uintLessThan(usize, count - hot); }
If you want to get 90% out of this post, I strongly recommend that you do not read any further, and instead copy the above code into your own `main.zig` and try to write `db.zig` yourself. I do think this is a most excellent exercise that can teach you more effectively than any blog post. It’ll take more time, of course, but you’ll get more knowledge per minute spent out of it.

If you will settle for the 10%, read on! And, if you want 100%, then do your implementation first and then come back here!
[The Table](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Table)
We will be building `db.zig` from the ground up. It all will make sense! At the end.

Our fundamental data structure is a sorted list of values. Here, I kindly ask the reader to engage suspension of disbelief: in a real database we would be using a data structure with efficient lookups and modifications, such as a B-tree or an LSM tree. For the purposes of this tutorial, we will use a simple sorted array, and will just close our eyes to O(N) insertions and removals.
Values are going to be sorted by a particular field. For example, we sort transfers by their ids. So, when creating a “Table” of transfers, we’ll need to pass the type of key, the type of value, and functions for extracting and comparing keys:
const TransfersTable = TableType(Transfer.ID, Transfer, struct { pub fn key_fn(value: Transfer) Transfer.ID { return value.id; } pub fn key_cmp(lhs: Transfer.ID, rhs: Transfer.ID) std.math.Order { return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs)); } });
Here’s the corresponding declaration:
fn TableType( comptime KeyType: type, comptime ValueType: type, comptime Functions: type, ) type { const key_fn = Functions.key_fn; const key_cmp = Functions.key_cmp; return struct { ... }; }
This is a type constructor function, which takes a bunch of types as arguments and returns a new type. Such functions can only be called at compile time. Zig doesn’t have the ability to create new types at runtime, unlike something like the JVM.
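As a smaller illustration of the same mechanism, here is a minimal sketch of a type constructor that has nothing to do with the database. `StackType`, `push`, and `pop` are hypothetical names made up for this example; the point is only that a function returning `type` is called with `comptime` arguments, and that equal arguments give back the same type:

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical example, not part of db.zig: a fixed-capacity stack whose
// element type and capacity are both compile-time parameters.
fn StackType(comptime T: type, comptime capacity: usize) type {
    return struct {
        items: [capacity]T = undefined,
        len: usize = 0,

        const Stack = @This();

        pub fn push(stack: *Stack, item: T) void {
            assert(stack.len < capacity);
            stack.items[stack.len] = item;
            stack.len += 1;
        }

        pub fn pop(stack: *Stack) ?T {
            if (stack.len == 0) return null;
            stack.len -= 1;
            return stack.items[stack.len];
        }
    };
}

test "type constructor usage" {
    var stack: StackType(u8, 4) = .{};
    stack.push(1);
    stack.push(2);
    try std.testing.expect(stack.pop().? == 2);
    // Same arguments give the same type, different arguments a different one.
    try std.testing.expect(StackType(u8, 4) == StackType(u8, 4));
    try std.testing.expect(StackType(u8, 4) != StackType(u8, 8));
}
```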
Passing the table of functions as a `Functions` type is a weird idiom of Zig. It would be more natural to use the following signature:

fn TableType( comptime KeyType: type, comptime ValueType: type, comptime key_fn: fn (value: ValueType) KeyType, comptime key_cmp: fn (lhs: KeyType, rhs: KeyType) std.math.Order, ) type

But this version is more painful to use at the call site. While `struct` is an expression in Zig, and you can declare one inline, `fn` is not an expression, so you can’t declare a function inline unless you employ another Zig idiom:

const my_function = struct { fn double(x: u32) u32 { return x * 2; } }.double;
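To see the idiom in use, here is a minimal sketch that declares a function inline, right where it is passed as a `comptime` argument. `map_in_place` is a hypothetical helper invented for this example:

```zig
const std = @import("std");

// Hypothetical helper: applies `f` to every element of `items` in place.
fn map_in_place(items: []u32, comptime f: fn (u32) u32) void {
    for (items) |*item| item.* = f(item.*);
}

test "inline function via an anonymous struct" {
    var items = [_]u32{ 1, 2, 3 };
    // The struct exists only to give the function a place to live;
    // `.double` immediately projects the function back out of it.
    map_in_place(&items, struct {
        fn double(x: u32) u32 {
            return x * 2;
        }
    }.double);
    try std.testing.expectEqualSlices(u32, &.{ 2, 4, 6 }, &items);
}
```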
Here’s the implementation of the table:
struct { values: std.ArrayListUnmanaged(Value) = .empty, pub const Key = KeyType; pub const Value = ValueType; const Table = @This(); pub fn search( table: *const Table, key: Key, start_index: usize, ) usize { return start_index + std.sort.lowerBound( Value, table.values.items[start_index..], key, compare_fn, ); } fn compare_fn(key: Key, value: Value) std.math.Order { return key_cmp(key, key_fn(value)); } pub fn get(table: *const Table, key: Key) ?Value { const index = table.search(key, 0); if (index >= table.values.items.len) return null; const value = table.values.items[index]; if (key_cmp(key, key_fn(value)) != .eq) return null; return value; } pub fn reserve( table: *Table, gpa: std.mem.Allocator, extra: usize, ) !void { try table.values.ensureUnusedCapacity(gpa, extra); } pub fn insert(table: *Table, value: Value) void { assert(table.values.unusedCapacitySlice().len > 0); const index = table.search(key_fn(value), 0); table.values.insertAssumeCapacity(index, value); } pub fn remove(table: *Table, value: Value) void { const index = table.search(key_fn(value), 0); const removed = table.values.orderedRemove(index); assert(std.meta.eql(value, removed)); } };
The `search` function binary searches for the index corresponding to the given `key` in the list of values. For convenience of the call-site we are yet to see, we also pass in the starting index for the search. It is useful for, e.g., a pagination-style API where you use the index as a cursor.

The `get` function then uses `search` to find the index, and furthermore checks that we have an exact match.

For the `insert` function, we do implement the reservation pattern: memory allocation and data structure modification are split into two functions. Modification proper is infallible, and has a reservation as a precondition.

As promised, we do a naive linear memcpy for `insert`/`remove`, but we pretend that it is actually logarithmic.

Another unrealistic simplification is that our API is scalar: we insert or remove a single item at a time. Both Zig and the relational model strongly encourage operating on a batch of objects at a time, pushing the `for`s down:

pub fn insert(table: *Table, values: []const Value) void
Even with a naive array list, the batched version runs in O(N + K log K), which is much faster than O(N K) of the scalar version repeated K times. But we leave batching as an exercise for the reader.
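For readers who want a hint, here is one possible shape of the batched version, as a sketch under simplifying assumptions rather than the post’s own solution: values are plain `u32`s instead of `Value`s with a `key_fn`, the batch is a mutable slice so it can be sorted in place, and `insert_batch` is a hypothetical name. It sorts the K new items and then merges from the back, so each existing item moves at most once:

```zig
const std = @import("std");

// Hypothetical batched insert into a sorted list of u32.
// Precondition: space for `batch.len` more items is already reserved.
fn insert_batch(list: *std.ArrayListUnmanaged(u32), batch: []u32) void {
    std.debug.assert(list.unusedCapacitySlice().len >= batch.len);
    std.mem.sort(u32, batch, {}, std.sort.asc(u32)); // O(K log K)

    var old = list.items.len; // old items not yet placed
    var new = batch.len; // new items not yet placed
    list.items.len += batch.len; // within the reserved capacity

    // Merge from the back, so nothing is overwritten before it is read: O(N + K).
    while (new > 0) {
        const dst = old + new - 1;
        if (old > 0 and list.items[old - 1] > batch[new - 1]) {
            list.items[dst] = list.items[old - 1];
            old -= 1;
        } else {
            list.items[dst] = batch[new - 1];
            new -= 1;
        }
    }
    // Once `new` reaches zero, the remaining old items are already in place.
}
```

The split between reserving and inserting stays the same as in the scalar version: the caller calls `ensureUnusedCapacity` for the whole batch first, and `insert_batch` itself cannot fail.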
[The Indexes](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Indexes)
Now comes the relational model part of the tutorial. If we have
const TransfersTable = TableType(Transfer.ID, Transfer, struct { pub fn key_fn(value: Transfer) Transfer.ID { return value.id; } pub fn key_cmp(lhs: Transfer.ID, rhs: Transfer.ID) std.math.Order { return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs)); } });
we can efficiently filter transfers by their ids. How can we add an ability to filter transfers by, say, debit account id?
The idea is to add a second sorted list, which stores pairs of `(Account.ID, Transfer.ID)`. With this setup, if you are interested in all transfers from alice, you can binary search for alice’s account ID in the second list, fetch the corresponding transfer ids, and then look up the transfers in the first list:
    Transfers:
    id=1 debit_account=alice   credit_account=bob
    id=2 debit_account=charley credit_account=bob
    id=3 debit_account=alice   credit_account=charley

    Index:
    debit_account=alice   id=1
    debit_account=alice   id=3
    debit_account=charley id=2
What makes this work is that we can maintain two lists in sync. When creating a transfer, you insert it in the Transfers table, but also insert the corresponding pair to the index table. Removal works similarly.
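The effect of sorting the index by account first and transfer id second can be sketched in isolation. Everything here is a simplified assumption: ids are plain `u64`s, `Pair`, `pair_less_than`, and `transfers_of` are hypothetical names, and a linear scan stands in for the binary search to keep the example short:

```zig
const std = @import("std");

// Hypothetical, simplified stand-ins for (Account.ID, Transfer.ID).
const Pair = struct { account: u64, transfer: u64 };

// Order by account first, then by transfer id.
fn pair_less_than(_: void, lhs: Pair, rhs: Pair) bool {
    if (lhs.account != rhs.account) return lhs.account < rhs.account;
    return lhs.transfer < rhs.transfer;
}

// Copies the transfer ids of `account` into `out`.
// A linear scan stands in for the binary search described in the text.
fn transfers_of(index: []const Pair, account: u64, out: []u64) []u64 {
    var count: usize = 0;
    for (index) |pair| {
        if (count == out.len) break;
        if (pair.account == account) {
            out[count] = pair.transfer;
            count += 1;
        }
    }
    return out[0..count];
}

test "ids for one account come out sorted" {
    const alice = 1;
    const charley = 3;
    var index = [_]Pair{
        .{ .account = alice, .transfer = 1 },
        .{ .account = charley, .transfer = 2 },
        .{ .account = alice, .transfer = 3 },
    };
    std.mem.sort(Pair, &index, {}, pair_less_than);

    var buffer: [10]u64 = undefined;
    const ids = transfers_of(&index, alice, &buffer);
    try std.testing.expectEqualSlices(u64, &.{ 1, 3 }, ids);
}
```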
[Implementing the Index](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Implementing-the-Index)
Let’s implement an index table. Here, we start using `comptime` for real. In particular, we will parametrize our index table by the type (such as `Transfer`) and the name of the field to build an index over (such as `.debit_account`):

const TransferDebitAccountIndex = IndexTableType(Transfer, .debit_account);
This is the signature:
fn IndexTableType( comptime Value: type, comptime field: std.meta.FieldEnum(Value), ) type { ... }
This is a type constructor, which takes a type and a field, and returns a type. The `std.meta.FieldEnum(Value)` call returns an enum whose variants are the fields of `Value`. E.g., our `Transfer` is

const Transfer = struct { id: ID, amount: u128, debit_account: Account.ID, credit_account: Account.ID, };

so the corresponding `FieldEnum` would look like this:

const TransferFieldEnum = enum { id, amount, debit_account, credit_account, };
Now, the body:
fn IndexTableType( comptime Value: type, comptime field: std.meta.FieldEnum(Value), ) type { const FieldType = @FieldType(Value, @tagName(field)); const Pair = struct { field: FieldType, id: Value.ID, }; return TableType(Pair, Pair, struct { pub fn key_fn(value: Pair) Pair { return value; } pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order { return order_by(Pair, lhs, rhs, &.{ .field, .id }); } }); }
Ultimately, we want to delegate to the existing `TableType`, as that already implements the logic for storing a sorted list of items. The item for us is a pair of a field value and a value id. One note about the `const FieldType = @FieldType(Value, @tagName(field));` incantation used. `FieldEnum` is the library-level abstraction. Meta programming builtins, like `@FieldType` or `@field`, work with string names of fields (at compile time, of course). `@tagName` converts from `.debit_account` to `"debit_account"`. We don’t have to use `FieldEnum`, and could have used strings throughout, but `FieldEnum` gives us two advantages:
- Earlier type errors: calling `IndexTableType` with a field that doesn’t exist will error out at the call site, rather than at the definition site.
- Greppability: in Zig, field access is always spelled as `.debit_account` syntactically, so it is advantageous to stick to the same convention during meta programming, to make sure it also gets into textual searches.
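The `FieldEnum` plus `@tagName` plus `@field` combination can be tried out on a toy type. In this sketch, `Point` and `get` are hypothetical names that do not appear in `db.zig`:

```zig
const std = @import("std");

// Hypothetical example type, unrelated to the database.
const Point = struct { x: u32, y: u32 };

// The field is named with an enum literal at the call site (`.x`, `.y`),
// and converted to a string only where the builtin requires one.
fn get(value: Point, comptime field: std.meta.FieldEnum(Point)) u32 {
    return @field(value, @tagName(field));
}

test "field access through FieldEnum" {
    const point: Point = .{ .x = 1, .y = 2 };
    try std.testing.expect(get(point, .x) == 1);
    try std.testing.expect(get(point, .y) == 2);
    // `get(point, .z)` would be rejected at this call site, because `.z`
    // is not a variant of `FieldEnum(Point)`.
}
```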
The `Key` for our table is the entire `Pair`. That is, we want to sort not only on the field value, but on ID as well, to make sure that, when we look up all ids for a particular field value, we get back a sorted list. That’s why our `key_fn` is an identity function:

pub fn key_fn(value: Pair) Pair { return value; }

In `key_cmp`, we want to compare first by `field`, and then by `id`. We can do it manually, but it’s more fun to do some meta programming here as well:

pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order { return order_by(Pair, lhs, rhs, &.{ .field, .id }); } fn order_by( comptime T: type, lhs: T, rhs: T, comptime fields: []const std.meta.FieldEnum(T), ) std.math.Order { ... }
`order_by` is our first mixed-mode function. Some arguments are `comptime`, but some are runtime. This function should compare a pair of `T` by sequentially comparing the values of the corresponding fields, and returning as soon as two unequal fields are found. Here we use an `inline for`:

fn order_by( comptime T: type, lhs: T, rhs: T, comptime fields: []const std.meta.FieldEnum(T), ) std.math.Order { inline for (fields) |field| { const order = order_enums( @field(lhs, @tagName(field)), @field(rhs, @tagName(field)), ); if (order != .eq) return order; } return .eq; }
Because the list of fields is known at compile time, the loop is fully unrolled, and the actual generated code ends up looking as a sequence of direct comparisons.
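For intuition, this is roughly what the unrolled loop amounts to for `&.{ .field, .id }`, written out by hand. It is a sketch, not compiler output: `Pair` here is a hypothetical struct with plain integer fields, so that `std.math.order` can be called directly, and `order_pair` is a made-up name:

```zig
const std = @import("std");

// Hypothetical pair with plain integer fields, to keep the sketch short.
const Pair = struct { field: u64, id: u64 };

// One block per element of the comptime field list, in order.
fn order_pair(lhs: Pair, rhs: Pair) std.math.Order {
    {
        const order = std.math.order(lhs.field, rhs.field);
        if (order != .eq) return order;
    }
    {
        const order = std.math.order(lhs.id, rhs.id);
        if (order != .eq) return order;
    }
    return .eq;
}

test "field decides first, id breaks ties" {
    const a: Pair = .{ .field = 1, .id = 5 };
    const b: Pair = .{ .field = 1, .id = 7 };
    const c: Pair = .{ .field = 2, .id = 0 };
    try std.testing.expect(order_pair(a, b) == .lt);
    try std.testing.expect(order_pair(c, b) == .gt);
    try std.testing.expect(order_pair(a, a) == .eq);
}
```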
`@field` fetches a field from a value given a `comptime` field’s name. `order_enums` is a little helper which allows comparing either numbers or enums:

fn order_enums(lhs: anytype, rhs: @TypeOf(lhs)) std.math.Order { return switch (@typeInfo(@TypeOf(lhs))) { .int => std.math.order(lhs, rhs), .@"enum" => std.math.order( @intFromEnum(lhs), @intFromEnum(rhs), ), else => comptime unreachable, }; }

`@typeInfo` is a builtin that allows reflecting on the structure of types. In particular, it classifies types as structs, unions, enums, integers, etc., which is exactly what we need here. One wrinkle here is that `enum` is a keyword, so the `enum` variant of `@typeInfo` is spelled as `@"enum"`. The `@""` syntax allows using any string as a Zig identifier; it’s an escape for keywords.

And that’s basically it for indexes! Now we have our main table (object table), and a number of index tables. The next task is to bundle them together, so that we can enforce consistency between the tables.
[The Bundle](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#The-Bundle)
The next thing we will be building is the `Bundle`. It takes a type, a list of fields to build indexes over, and provides an API for creating and looking up values while maintaining consistency of indexes:

pub fn BundleType( comptime Value: type, comptime indexed_fields: []const std.meta.FieldEnum(Value), ) type
With the bundle, we can finally explain what the database we started with is:

const DB = struct { account: BundleType(Account, &.{}), transfer: BundleType(Transfer, &.{ .debit_account, .credit_account, }), };

And this was the API we used, and which we’ll now implement:

var db: DB = ...; try db.account.create(gpa, .{ .balance = 100 }); db.account.update(.{ .id = debit_account, .balance = dr.balance - amount }); db.account.get(debit_account); try db.transfer.create(gpa, .{ .debit_account = debit_account, .credit_account = credit_account, .amount = amount, }); db.transfer.filter( .{ .debit_account = alice, .credit_account = bob }, &transfers_buffer, );
Let’s start:
pub fn BundleType( comptime Value: type, comptime indexed_fields: []const std.meta.FieldEnum(Value), ) type { return struct { id_counter: u64 = 0, objects: TableType(Value.ID, Value, struct { pub fn key_fn(value: Value) Value.ID { return value.id; } pub fn key_cmp( lhs: Value.ID, rhs: Value.ID, ) std.math.Order { return std.math.order( @intFromEnum(lhs), @intFromEnum(rhs), ); } }) = .{}, indexes: ???, const Bundle = @This(); pub fn get(bundle: *Bundle, id: Value.ID) ?Value { return bundle.objects.get(id); } ... } }
The basic structure is clear: we have an `id_counter` for assigning new ids when creating values, the object table which stores sorted values and which directly powers the `get` method, and then we have the indexes. The indexes are tricky. For transfers, where we index `debit_account` and `credit_account`, we want the `indexes` to look like this:

indexes: struct { debit_account: IndexTableType(Transfer, .debit_account), credit_account: IndexTableType(Transfer, .credit_account), }

So we need to iterate over the passed-in `indexed_fields` and create a struct with a field for each index. And that’s exactly what we’ll do:

indexes: blk: { var fields: [indexed_fields.len]std.builtin.Type.StructField = undefined; for (indexed_fields, 0..) |indexed, i| { const Type = IndexTableType(Value, indexed); fields[i] = .{ .name = @tagName(indexed), .type = Type, .default_value_ptr = &(Type{}), .is_comptime = false, .alignment = @alignOf(Type), }; } break :blk @Type(.{ .@"struct" = .{ .layout = .auto, .is_tuple = false, .decls = &.{}, .fields = &fields, } }); } = .{},
This makes much more sense if you read it backwards:
- `break :blk` “returns” a value from the block labeled `blk:`, this is Zig’s more imperative take on “everything is an expression”.
- We return a `@Type`. `@Type` is an inverse of `@typeInfo`, in the sense that, hand waving a bit, `@Type(@typeInfo(T)) == T`. That is, we pass a description of the type, and get a type back. What we want to get is a struct with fields, so we pass `@"struct"` and an array of fields.
- Each element of `fields` is a `std.builtin.Type.StructField`, a description of a field, that is, its type, name, and default.
- The type is `const Type = IndexTableType(Value, indexed);` That is, the index table for the `indexed` field of `Value`.
- The name matches the name of the index field.
- And the default is just a default for the `Type`.
- Finally, we need `indexed_fields.len` fields.
Now that we have all tables, `create` and `update` are relatively straightforward. For `create`, we need to make sure to insert the appropriate values into the objects table and into all of the indexes:

pub fn create( bundle: *Bundle, gpa: std.mem.Allocator, value: Value, ) !Value.ID { assert(@intFromEnum(value.id) == 0); try bundle.objects.reserve(gpa, 1); inline for (indexed_fields) |field| { try @field(bundle.indexes, @tagName(field)) .reserve(gpa, 1); } errdefer comptime unreachable; bundle.id_counter += 1; const id: Value.ID = @enumFromInt(bundle.id_counter); var value_with_id = value; value_with_id.id = id; bundle.objects.insert(value_with_id); inline for (indexed_fields) |indexed_field| { const field = @tagName(indexed_field); @field(bundle.indexes, field) .insert(.{ .field = @field(value, field), .id = id }); } return id; }
We start with asserting that the id is 0. It’s our job to assign the id! But, before we do that, we reserve space for one more entry in all the tables. This is the place in the function where we allocate, and where we can fail. The cryptic `errdefer comptime unreachable` is a Zig tongue twister to say that no errors can happen after this point in the function. Separating memory reservation and actual modification is helpful to make sure that the data structure remains consistent even in the face of a memory error.

Had we not split out fallible `reserve` from infallible `insert`, and kept the allocation inside `insert`, we could have ended up in a situation where a value is inserted only in some of the indexes.

I must admit that I am deeply skeptical that it is possible to consistently handle this kind of issue on memory allocation error correctly; I am in the “abort on OOM” camp personally. As a quick quiz, have you noticed where we didn’t handle this issue correctly in the code we’ve already seen?
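The reserve-then-insert split can also be seen on a much smaller structure. The following is a sketch, assuming the same Zig version as the rest of the post (for `.empty`); `Mirrored`, `primary`, `shadow`, and `push` are hypothetical names, and the two plain lists stand in for an object table and one index:

```zig
const std = @import("std");

// Hypothetical structure whose two lists must always have the same length.
const Mirrored = struct {
    primary: std.ArrayListUnmanaged(u32) = .empty,
    shadow: std.ArrayListUnmanaged(u32) = .empty,

    fn deinit(m: *Mirrored, gpa: std.mem.Allocator) void {
        m.primary.deinit(gpa);
        m.shadow.deinit(gpa);
    }

    fn push(m: *Mirrored, gpa: std.mem.Allocator, value: u32) !void {
        // Fallible half: reserve space everywhere before touching anything.
        // If the second reservation fails, the first list merely keeps
        // some spare capacity; its contents are unchanged.
        try m.primary.ensureUnusedCapacity(gpa, 1);
        try m.shadow.ensureUnusedCapacity(gpa, 1);
        errdefer comptime unreachable;

        // Infallible half: neither call can fail, so both lists grow or neither does.
        m.primary.appendAssumeCapacity(value);
        m.shadow.appendAssumeCapacity(value);
    }
};

test "both lists grow together" {
    const gpa = std.testing.allocator;
    var m: Mirrored = .{};
    defer m.deinit(gpa);
    try m.push(gpa, 7);
    try m.push(gpa, 8);
    try std.testing.expectEqualSlices(u32, m.primary.items, m.shadow.items);
}
```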
With that throat clearing done, the actual logic is straightforward:
- assign id,
- insert the value into the objects table,
- and then, for each of the indexed fields, insert the `(field, id)` pair into the corresponding index table. The `inline for` loop is guaranteed to be fully unrolled at compile time.
As usual, we use `@field` to get a field by name (cf. JavaScript `obj.foo` vs `obj[foo]`).

The `update` is even simpler, as we don’t need to allocate new memory. So we just remove old values and insert new ones:

pub fn update(bundle: *Bundle, value_new: Value) void { const id = value_new.id; assert(@intFromEnum(id) != 0); const value_old = bundle.get(value_new.id).?; assert(value_old.id == id); bundle.objects.remove(value_old); bundle.objects.insert(value_new); inline for (indexed_fields) |indexed_field| { const field = @tagName(indexed_field); @field(bundle.indexes, field).remove(.{ .field = @field(value_old, field), .id = id, }); @field(bundle.indexes, field).insert(.{ .field = @field(value_new, field), .id = id, }); } }
Although simple, this is the trick that makes the whole relational model work. See how we keep the indexes consistent, by looking up the old value, and removing the corresponding old pairs from indexes.
If that was too easy, don’t worry, we’ll do `filter` next, and that’s the toughest one in the entire exercise :)

[Merge Sort Join](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Merge-Sort-Join)
Let’s revisit our original example:
var transfers_buffer: [10]Transfer = undefined; const alice_to_bob_transfers = db.transfer.filter( .{ .debit_account = alice, .credit_account = bob }, &transfers_buffer, ); for (alice_to_bob_transfers) |t| { std.debug.print("alice to bob: from={} to={} amount={}\n", .{ t.debit_account, t.credit_account, t.amount, }); }
The filter takes a filtering condition: equality conditions on a subset of indexed fields. It then fills the output buffer with as many objects as possible that match all conditions. How can we do that efficiently?
Recall that our index tables are sorted by the field’s value first, and then by the object’s ID. So, to find all transfers with `.debit_account = alice`, we binary search the debit account index for pairs starting with `alice`. Similarly, we can find all transfers with `.credit_account = bob` by binary searching in the other index table.

In both indexes, we get some slice of pairs whose first components are `alice` and `bob` respectively. What’s more, the second components of the pairs are sorted transfer ids! So, if we want to find ids which match both conditions, we merge two sorted sequences of ids!

Now let’s do this for an arbitrary number of indexes:
pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value { }
We can’t really type the `query` parameter, so we use Zig’s `anytype` here. This is not dynamic typing, rather, it’s monomorphization maximus: every distinct type of `query` will generate a fresh copy of the `filter` function.

First, we reflect on `query` to figure out how many index tables we need to intersect:

const fields = comptime std.meta.fieldNames(@TypeOf(query));
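The same reflection can be exercised on its own. In this sketch, `sum_fields` is a hypothetical helper that accepts any struct with `u64`-compatible fields; each distinct argument type gets its own instantiation, with the loop unrolled over that type’s field names:

```zig
const std = @import("std");

// Hypothetical helper: sums all fields of whatever struct it is given.
fn sum_fields(query: anytype) u64 {
    // Field names are known at compile time for each instantiation.
    const fields = comptime std.meta.fieldNames(@TypeOf(query));
    var sum: u64 = 0;
    inline for (fields) |field| {
        sum += @field(query, field);
    }
    return sum;
}

test "one source function, one instantiation per argument type" {
    const debit: u64 = 1;
    const credit: u64 = 2;
    const amount: u64 = 3;
    try std.testing.expect(sum_fields(.{ .debit = debit }) == 1);
    try std.testing.expect(
        sum_fields(.{ .debit = debit, .credit = credit, .amount = amount }) == 6,
    );
}
```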
Then, we set up a bunch of indexes (the `ijk` kind, not the database index kind). We need an index into the output buffer, an index into the object table, and an index for each index table:

var out_index: usize = 0; var object_index: usize = 0; var indexes: [fields.len]usize = @splat(0);
Then, for each index table, we’d want to get a slice of pairs that match the query. Ideally, we’d want to have `fields.len` local variables, one variable per query field, but Zig doesn’t allow creating local variables via reflection.

What we can have though is a single variable, which is a tuple of slices. To create such a thing, we first need to create its type:
const TupleOfSlices = comptime blk: { var components: [fields.len]type = undefined; for (0..fields.len) |i| { const IndexTable = @FieldType( @TypeOf(bundle.indexes), fields[i], ); const Pair = IndexTable.Value; components[i] = []Pair; } break :blk std.meta.Tuple(&components); }; var slices: TupleOfSlices = undefined;
Then, we can use the `search` function to find the actual slices:

var slices: TupleOfSlices = undefined; inline for (fields, 0..) |field, i| { const index = @field(bundle.indexes, field).search(.{ .field = @field(query, field), .id = @enumFromInt(0), }, 0); slices[i] = @field(bundle.indexes, field) .values.items[index..]; }
Now, the merge algorithm proper. We are pointing a finger at each of the slices we have and, on each step, advance the fingers that point at the smallest ID. If at some point all our fingers point at the same id, we fetch the corresponding `Value` and add it to the output.

Ideally, we’d use a k-way merge here (`comptime` specialized to a particular k!), but, for simplicity, we’ll use linear search to find the lowest id. We will be iterating until either we run out of space in the output buffer, or we run off one of our slices:

outer: while (true) { if (out_index == out.len) break :outer; inline for (0..fields.len) |i| { if (indexes[i] == slices[i].len) { break :outer; } if (slices[i][indexes[i]].field != @field(query, fields[i])) { break :outer; } } ... }
Then, we find the minimum ID:
var id_min = slices[0][indexes[0]].id; inline for (1..slices.len) |i| { const id_next = slices[i][indexes[i]].id; if (@intFromEnum(id_next) < @intFromEnum(id_min)) { id_min = id_next; } }
Then, we advance all slices with the minimal ID, counting them:
var advanced_count: u32 = 0; inline for (0..slices.len) |i| { if (slices[i][indexes[i]].id == id_min) { indexes[i] += 1; advanced_count += 1; } }
If we advanced all slices (that is, all ids at a particular position are the same), we lookup the corresponding transfer in the object table and add it to the output:
if (advanced_count == slices.len) { object_index = bundle.objects.search(id_min, object_index); assert(object_index < bundle.objects.values.items.len); const value = bundle.objects.values.items[object_index]; inline for (fields) |field| { assert(@field(value, field) == @field(query, field)); } out[out_index] = value; out_index += 1; object_index += 1; }
This is where we pass a non-zero index to `search`!

Altogether:
pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value { const fields = comptime std.meta.fieldNames(@TypeOf(query)); var indexes: [fields.len]usize = @splat(0); var out_index: usize = 0; var object_index: usize = 0; const TupleOfSlices = comptime blk: { var components: [fields.len]type = undefined; for (0..fields.len) |i| { const IndexTable = @FieldType( @TypeOf(bundle.indexes), fields[i], ); const Pair = IndexTable.Value; components[i] = []Pair; } break :blk std.meta.Tuple(&components); }; var slices: TupleOfSlices = undefined; inline for (fields, 0..) |field, i| { const index = @field(bundle.indexes, field).search(.{ .field = @field(query, field), .id = @enumFromInt(0), }, 0); slices[i] = @field(bundle.indexes, field) .values.items[index..]; } outer: while (true) { if (out_index == out.len) break :outer; inline for (0..fields.len) |i| { if (indexes[i] == slices[i].len) { break :outer; } if (slices[i][indexes[i]].field != @field(query, fields[i])) { break :outer; } } var id_min = slices[0][indexes[0]].id; inline for (1..slices.len) |i| { const id_next = slices[i][indexes[i]].id; if (@intFromEnum(id_next) < @intFromEnum(id_min)) { id_min = id_next; } } var advanced_count: u32 = 0; inline for (0..slices.len) |i| { if (slices[i][indexes[i]].id == id_min) { indexes[i] += 1; advanced_count += 1; } } if (advanced_count == slices.len) { object_index = bundle.objects.search(id_min, object_index); assert(object_index < bundle.objects.values.items.len); const value = bundle.objects.values.items[object_index]; inline for (fields) |field| { assert(@field(value, field) == @field(query, field)); } out[out_index] = value; out_index += 1; object_index += 1; } } return out[0..out_index]; }
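Stripped of the `comptime` machinery, the core of `filter` for exactly two indexes is an intersection of two sorted id sequences. Here is a minimal sketch of that special case, assuming plain `u64` ids without duplicates; `intersect` is a hypothetical name and not part of `db.zig`:

```zig
const std = @import("std");

// Writes the ids present in both sorted inputs into `out`,
// stopping early when `out` is full. Returns the filled prefix.
fn intersect(a: []const u64, b: []const u64, out: []u64) []u64 {
    var i: usize = 0;
    var j: usize = 0;
    var count: usize = 0;
    while (i < a.len and j < b.len and count < out.len) {
        switch (std.math.order(a[i], b[j])) {
            // Advance the finger that points at the smaller id.
            .lt => i += 1,
            .gt => j += 1,
            // Both fingers agree: emit the id and advance both.
            .eq => {
                out[count] = a[i];
                count += 1;
                i += 1;
                j += 1;
            },
        }
    }
    return out[0..count];
}

test "ids matching both conditions" {
    const from_alice = [_]u64{ 1, 3, 4, 8 };
    const to_bob = [_]u64{ 1, 2, 4, 9 };
    var buffer: [10]u64 = undefined;
    const both = intersect(&from_alice, &to_bob, &buffer);
    try std.testing.expectEqualSlices(u64, &.{ 1, 4 }, both);
}
```

The generic version above does the same thing with one finger per query field, and uses the "advanced all slices" count in place of the `.eq` branch.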
Congratulations! We’ve finished `BundleType`, which means we are past the worst of it, and are almost at the end.
We only need code to turn
const DB = DBType(.{ .tables = .{ .account = Account, .transfer = Transfer, }, .indexes = .{ .transfer = .{ .debit_account, .credit_account, }, }, });
into a bunch of bundles with indexes, but at this point this should be trivial:
pub fn DBType(comptime schema: anytype) type { const bundle_names = std.meta.fieldNames(@TypeOf(schema.tables)); var bundles: [bundle_names.len]std.builtin.Type.StructField = undefined; for (bundle_names, 0..) |name, i| { const Value = @field(schema.tables, name); const indexes = if (@hasField(@TypeOf(schema.indexes), name)) @field(schema.indexes, name) else .{}; const Bundle = BundleType(Value, &indexes); bundles[i] = .{ .name = name, .type = Bundle, .default_value_ptr = &Bundle{}, .is_comptime = false, .alignment = @alignOf(Bundle), }; } return @Type(.{ .@"struct" = .{ .layout = .auto, .is_tuple = false, .decls = &.{}, .fields = &bundles, } }); }
Here’s the entire thing, `db.zig`, just under 300 lines:

const std = @import("std"); const assert = std.debug.assert; pub fn DBType(comptime schema: anytype) type { const bundle_names = std.meta.fieldNames(@TypeOf(schema.tables)); var bundles: [bundle_names.len]std.builtin.Type.StructField = undefined; for (bundle_names, 0..) |name, i| { const Value = @field(schema.tables, name); const indexes = if (@hasField(@TypeOf(schema.indexes), name)) @field(schema.indexes, name) else .{}; const Bundle = BundleType(Value, &indexes); bundles[i] = .{ .name = name, .type = Bundle, .default_value_ptr = &Bundle{}, .is_comptime = false, .alignment = @alignOf(Bundle), }; } return @Type(.{ .@"struct" = .{ .layout = .auto, .is_tuple = false, .decls = &.{}, .fields = &bundles, } }); } pub fn BundleType( comptime Value: type, comptime indexed_fields: []const std.meta.FieldEnum(Value), ) type { return struct { id_counter: u64 = 0, objects: TableType(Value.ID, Value, struct { pub fn key_fn(value: Value) Value.ID { return value.id; } pub fn key_cmp(lhs: Value.ID, rhs: Value.ID) std.math.Order { return std.math.order(@intFromEnum(lhs), @intFromEnum(rhs)); } }) = .{}, indexes: blk: { var fields: [indexed_fields.len]std.builtin.Type.StructField = undefined; for (indexed_fields, 0..) 
|indexed, i| { const Type = IndexTableType(Value, indexed); fields[i] = .{ .name = @tagName(indexed), .type = Type, .default_value_ptr = &(Type{}), .is_comptime = false, .alignment = @alignOf(Type), }; } break :blk @Type(.{ .@"struct" = .{ .layout = .auto, .is_tuple = false, .decls = &.{}, .fields = &fields, } }); } = .{}, const Bundle = @This(); pub fn get(bundle: *Bundle, id: Value.ID) ?Value { return bundle.objects.get(id); } pub fn create( bundle: *Bundle, gpa: std.mem.Allocator, value: Value, ) !Value.ID { assert(@intFromEnum(value.id) == 0); try bundle.objects.reserve(gpa, 1); inline for (indexed_fields) |field| { try @field(bundle.indexes, @tagName(field)) .reserve(gpa, 1); } errdefer comptime unreachable; bundle.id_counter += 1; const id: Value.ID = @enumFromInt(bundle.id_counter); var value_with_id = value; value_with_id.id = id; bundle.objects.insert(value_with_id); inline for (indexed_fields) |indexed_field| { const field = @tagName(indexed_field); @field(bundle.indexes, field) .insert(.{ .field = @field(value, field), .id = id }); } return id; } pub fn update(bundle: *Bundle, value_new: Value) void { const id = value_new.id; assert(@intFromEnum(id) != 0); const value_old = bundle.get(value_new.id).?; assert(value_old.id == id); bundle.objects.remove(value_old); bundle.objects.insert(value_new); inline for (indexed_fields) |indexed_field| { const field = @tagName(indexed_field); @field(bundle.indexes, field) .remove(.{ .field = @field(value_old, field), .id = id }); @field(bundle.indexes, field) .insert(.{ .field = @field(value_new, field), .id = id }); } } pub fn filter(bundle: *Bundle, query: anytype, out: []Value) []Value { const fields = comptime std.meta.fieldNames(@TypeOf(query)); var indexes: [fields.len]usize = @splat(0); var out_index: usize = 0; var object_index: usize = 0; const TupleOfSlices = comptime blk: { var components: [fields.len]type = undefined; for (0..fields.len) |i| { const IndexTable = @FieldType( @TypeOf(bundle.indexes), 
fields[i], ); const Pair = IndexTable.Value; components[i] = []Pair; } break :blk std.meta.Tuple(&components); }; var slices: TupleOfSlices = undefined; inline for (fields, 0..) |field, i| { const index = @field(bundle.indexes, field).search(.{ .field = @field(query, field), .id = @enumFromInt(0), }, 0); slices[i] = @field(bundle.indexes, field) .values.items[index..]; } outer: while (true) { if (out_index == out.len) break :outer; inline for (0..fields.len) |i| { if (indexes[i] == slices[i].len) break :outer; if (slices[i][indexes[i]].field != @field(query, fields[i])) break :outer; } var id_min = slices[0][indexes[0]].id; inline for (1..slices.len) |i| { if (@intFromEnum(slices[i][indexes[i]].id) < @intFromEnum(id_min)) { id_min = slices[i][indexes[i]].id; } } var advanced_count: u32 = 0; inline for (0..slices.len) |i| { if (slices[i][indexes[i]].id == id_min) { indexes[i] += 1; advanced_count += 1; } } if (advanced_count == slices.len) { object_index = bundle.objects.search(id_min, object_index); assert(object_index < bundle.objects.values.items.len); const value = bundle.objects.values.items[object_index]; inline for (fields) |field| { assert(@field(value, field) == @field(query, field)); } out[out_index] = value; out_index += 1; object_index += 1; } } return out[0..out_index]; } }; } fn IndexTableType(comptime Value: type, comptime field: std.meta.FieldEnum(Value)) type { const FieldType = @FieldType(Value, @tagName(field)); const Pair = struct { field: FieldType, id: Value.ID, }; return TableType(Pair, Pair, struct { pub fn key_fn(value: Pair) Pair { return value; } pub fn key_cmp(lhs: Pair, rhs: Pair) std.math.Order { return order_by(Pair, lhs, rhs, &.{ .field, .id }); } }); } fn order_by( comptime T: type, lhs: T, rhs: T, comptime fields: []const std.meta.FieldEnum(T), ) std.math.Order { inline for (fields) |field| { const order = order_enums( @field(lhs, @tagName(field)), @field(rhs, @tagName(field)), ); if (order != .eq) return order; } return .eq; } fn 
order_enums(lhs: anytype, rhs: @TypeOf(lhs)) std.math.Order { return switch (@typeInfo(@TypeOf(lhs))) { .int => std.math.order(lhs, rhs), .@"enum" => std.math.order( @intFromEnum(lhs), @intFromEnum(rhs), ), else => comptime unreachable, }; } fn TableType( comptime KeyType: type, comptime ValueType: type, comptime Functions: type, ) type { const key_fn = Functions.key_fn; const key_cmp = Functions.key_cmp; return struct { values: std.ArrayListUnmanaged(Value) = .empty, pub const Key = KeyType; pub const Value = ValueType; const Table = @This(); pub fn get(table: *const Table, key: Key) ?Value { const index = table.search(key, 0); if (index >= table.values.items.len) return null; const value = table.values.items[index]; if (key_cmp(key, key_fn(value)) != .eq) return null; return value; } pub fn reserve(table: *Table, gpa: std.mem.Allocator, extra: usize) !void { try table.values.ensureUnusedCapacity(gpa, extra); } pub fn insert(table: *Table, value: Value) void { assert(table.values.unusedCapacitySlice().len > 0); const index = table.search(key_fn(value), 0); table.values.insertAssumeCapacity(index, value); } pub fn remove(table: *Table, value: Value) void { const index = table.search(key_fn(value), 0); const removed = table.values.orderedRemove(index); assert(std.meta.eql(value, removed)); } pub fn search(table: *const Table, key: Key, start_index: usize) usize { return start_index + std.sort.lowerBound( Value, table.values.items[start_index..], key, compare_fn, ); } fn compare_fn(key: Key, value: Value) std.math.Order { return key_cmp(key, key_fn(value)); } }; }
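The core of `filter` in the listing is an intersection of sorted `(field, id)` index runs. As a rough illustration only, here is that idea sketched in Python rather than Zig; the `filter_ids` name, the dict-based index layout, and the sample data are assumptions made for this sketch and are not part of the original code. It assumes at least one queried field and ids starting at 1, as in the listing.

```python
from bisect import bisect_left


def filter_ids(indexes, query, limit):
    """Return up to `limit` ids whose indexed fields all match `query`.

    indexes: dict mapping field name -> list of (field_value, id) pairs,
             kept sorted (hypothetical layout for this sketch).
    query:   dict mapping field name -> wanted value (must be non-empty).
    """
    runs = []
    cursors = []
    for field, wanted in query.items():
        pairs = indexes[field]
        # Lower bound: first pair >= (wanted, 0), i.e. the start of the run.
        cursors.append(bisect_left(pairs, (wanted, 0)))
        runs.append((wanted, pairs))

    out = []
    while len(out) < limit:
        heads = []
        for (wanted, pairs), cur in zip(runs, cursors):
            # Stop as soon as any run is exhausted or leaves the wanted value.
            if cur == len(pairs) or pairs[cur][0] != wanted:
                return out
            heads.append(pairs[cur][1])

        id_min = min(heads)
        advanced = 0
        for i, head in enumerate(heads):
            if head == id_min:
                cursors[i] += 1
                advanced += 1
        # Every run was sitting on the same id: it matches all fields.
        if advanced == len(heads):
            out.append(id_min)
    return out


sample_indexes = {
    "color": [("blue", 2), ("red", 1), ("red", 3), ("red", 4)],
    "size": [(1, 1), (1, 4), (2, 2), (2, 3)],
}
matching = filter_ids(sample_indexes, {"color": "red", "size": 1}, limit=10)
```

Because each run is sorted by id within one field value, a single forward pass over all runs is enough; no hash set of candidate ids is needed.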
[Post Script](https://matklad.github.io/2025/03/19/comptime-zig-orm.html#Post-Script)
Note that, while the exercise is useful, it deliberately focuses narrowly on a single aspect of Zig: comptime reflection. You should avoid this feature if possible: hopefully, I have successfully convinced you that it can lead to somewhat mind-bending code. The topic of Zig in general is much larger, and I highly recommend the following resources, in this order:
-
- March 18, 2025
-
🔗 MetaBrainz Schema change release: May 19, 2025 rss
MusicBrainz is announcing a new database schema change release set for May 19, 2025. Like most of our recent schema changes, it should have little or no impact to downstream users.
There is one change to a major replicated table worth mentioning upfront: the `medium` table will have a new `gid` column added. If you're running custom SQL queries against the database that join the `medium` table at all, there is a small chance you could run into errors like `ERROR: column reference "gid" is ambiguous` if you're not properly qualifying the columns being selected.
We're also altering some columns on the `artist_release` and `artist_release_group` tables (see below for more details). These are materialized tables used by our website on the back-end to speed up certain pages; you should normally not be accessing them directly, but it's worth mentioning just in case. These tables do exist on mirrors, but are only populated with data if you've run `admin/BuildMaterializedTables` before.
Besides introducing some new tables for storing medium attributes and replacing some functions/triggers, you generally shouldn't have to worry about any other breaking changes in this release.
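To make the ambiguity concrete, here is a small sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL. The `mb_release` and `mb_medium` tables are simplified, hypothetical versions of the real schema, and SQLite words its error differently from the PostgreSQL message quoted above; the point is only that an unqualified `gid` fails once both joined tables have that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified stand-ins for the real tables.
conn.executescript(
    """
    CREATE TABLE mb_release (id INTEGER PRIMARY KEY, gid TEXT);
    CREATE TABLE mb_medium (id INTEGER PRIMARY KEY, release_id INTEGER, gid TEXT);
    INSERT INTO mb_release VALUES (1, 'release-gid');
    INSERT INTO mb_medium VALUES (10, 1, 'medium-gid');
    """
)


def run(sql):
    """Run a query and return its rows, or the error text if it fails."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


join = "FROM mb_release JOIN mb_medium ON mb_medium.release_id = mb_release.id"

# Unqualified: both tables now have a gid column, so the reference is ambiguous.
unqualified = run(f"SELECT gid {join}")

# Qualified: naming the table removes the ambiguity.
qualified = run(f"SELECT mb_release.gid {join}")
```

Queries that already write `release.gid` (or use table aliases consistently) should be unaffected by the new column.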
Finally, here is the complete list of scheduled tickets:
Database schema
The following tickets change the database schema in some way.
- MBS-9253: List EP release groups above singles on artist pages. A small change to the `get_artist_release_group_rows` function is required in order to be able to change the sorting of release groups to prioritize EPs over singles. The function will be changed to depend on the type's `child_order` (which can be safely changed at any time) rather than its `id` for sorting. While this function exists on mirrors, the function change shouldn't have any impact on them directly (but a change of the `child_order` of the types will affect the sorting for display on mirrors as well). We'll be adding new triggers to the `release_group_primary_type` and `release_group_secondary_type` tables to run the function when the tables change - these triggers will also exist on mirrors.
- MBS-13322: Race condition when removing unused URLs. A rare internal error can occur in one of our trigger functions that cleans up unused URLs. We'll replace that function, `delete_unused_url`, updating it to avoid a "race condition" whereby a URL can become used again the moment before it's deleted. This will have no impact on mirrors, as `delete_unused_url` is only invoked by triggers that don't exist on mirrors.
- MBS-13464: Inconsistent sorting of artist release/release group titles. In the May 2021 schema change, we added some new materialized tables to significantly speed up the loading of artists' release and release group listings: the not-so-surprisingly named `artist_release` and `artist_release_group` tables. These work by efficiently indexing an artist's releases and release groups by date and other attributes, and then finally by their titles. Except for efficiency reasons, we originally decided to only store the first character of the titles for sorting. That predictably leads to incorrect sorting in certain cases, like with undated live bootlegs, as shown in MBS-13464. After measuring the actual size impact, we've decided to update the `artist_release` and `artist_release_group` tables to replace their `sort_character` columns with `name` columns that store the complete titles.
- MBS-13768 - Add MBIDs to mediums. Adds a `gid` column to the `medium` table, and a new `medium_gid_redirect` table. It generates MBIDs for existing mediums that will be replicated to mirrors.
- MBS-13832: Also support PDF files in CAA / EAA `index_listing` (for `is_front` purposes). PDF files are never treated as `front` for cover art archive purposes, probably because they originally did not have PNG thumbnails generated by the Internet Archive. That changed quite a while ago though, and there seems to be no reason to single them out anymore. We will just replace the `index_listing` views for `cover_art_archive` and `event_art_archive` with ones amended to not filter out PDF files.
- MBS-13964: Some recordings are missing a first release date. A bug was discovered that causes recordings to sometimes have incorrect first-release-date values if any of the releases they're attached to are merged with the "append" strategy. We'll be adding a new trigger to the `medium` table that updates `recording_first_release_date` properly when such merges occur. Note that since `recording_first_release_date` is a materialized table, this trigger will also run on mirrors; that way it's kept up-to-date even after running `admin/BuildMaterializedTables` initially.
- MBS-13965: Extend entity attribute schema to mediums. We will add the same tables for mediums as we already have for other entities that can potentially support entity attributes: `medium_attribute_type`, `medium_attribute_type_allowed_value` and `medium_attribute`. This will eventually allow us to support medium-level attributes such as per-medium catalog numbers and barcodes, colors for vinyl, etc. We will not be implementing the feature fully yet - this is just the schema change required to be able to implement it at a point of our choice in the future, at least not before the release editor has been migrated to React. As such, the new tables will be added in mirrors but will be empty for quite a while.
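The sorting problem described in MBS-13464 is easy to reproduce in miniature. The following sketch is illustrative only: the titles are made up, and plain Python sorting stands in for the database's ordering on a `sort_character` versus a full `name` column.

```python
# Hypothetical undated releases that share a first character.
titles = ["Live in Tokyo", "Live in Berlin", "Abbey", "Live at Leeds"]

# Sorting on the first character only: all "L" titles compare equal,
# so they stay in whatever order they arrived in.
by_sort_character = sorted(titles, key=lambda title: title[:1])

# Sorting on the complete title orders the "L" titles correctly.
by_name = sorted(titles)
```

With only one character stored, any tie-break among titles sharing that character is effectively arbitrary, which is the kind of inconsistency the ticket reports for undated live bootlegs.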
Data corrections
- MBS-13966: Release group first release dates need to be recalculated. Another (unrelated) issue with "first release date" information, but this time with release groups rather than recordings. We've found that a small percentage of release groups' first release dates (as stored in the `release_group_meta` table and returned in the web service) is wrong. We won't be making any schema changes to address this, but will run a script to rebuild the incorrect data.
Search indexes
Data corrections to the `recording_first_release_date` and `release_group_meta` tables do affect indexed recording and release group data respectively. If you have live search indexing enabled, those changes should be propagated to the search indexes automatically. Otherwise, you will have to perform a full reindex of those entities' search indexes.
We’ll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below or on the relevant above-linked tickets.
-
🔗 astral-sh/uv 0.6.8 release
Release Notes
Enhancements
- Add support for enabling all groups by default with `default-groups = "all"` (#12289)
- Add simpler `--managed-python` and `--no-managed-python` flags for toggling Python preferences (#12246)
Performance
- Avoid allocations for default cache keys (#12063)
Bug fixes
- Allow local version mismatches when validating lockfile (#12285)
- Allow owned string when deserializing `requires-python` (#12278)
- Make cache errors non-fatal in `Planner::build` (#12281)
uv 0.6.8
Install uv 0.6.8
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.8/uv-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.8/uv-installer.ps1 | iex"
Download uv 0.6.8
File | Platform | Checksum
---|---|---
uv-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum
uv-build 0.6.8
Install uv-build 0.6.8
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.8/uv-build-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.8/uv-build-installer.ps1 | iex"
Download uv-build 0.6.8
File | Platform | Checksum
---|---|---
uv-build-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-build-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-build-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-build-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-build-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-build-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-build-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-build-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-build-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-build-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-build-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-build-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-build-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-build-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-build-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-build-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-build-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum
-
🔗 mhx/dwarfs dwarfs-0.10.1 release
Bugfixes
- Allow building `utils_test` against a non-compatible, system-installed version of gtest. This is a common issue when trying to integrate dwarfs into a package manager, as these generally disallow fetching external dependencies at build time.
- `dwarfsck` was always reporting a block size of 1 byte rather than the actual block size of the image.
- `DWARFS_HAVE_LIBBROTLI` was not set correctly in the config file, causing build errors if the library was built without `brotli`.
- Several small fixes for building with Homebrew.
Full Changelog: v0.10.0...v0.10.1
SHA-256 Checksums
53bb766f3a22f019c4bac7cbf995376cb4f3f0ad69e4793471af11c954185227 dwarfs-0.10.1-Linux-aarch64-clang-stacktrace.tar.xz
f272f667649d71ec7d29d6822ad4198e13a33e997d722f74f2bca23b239de72f dwarfs-0.10.1-Linux-aarch64-clang.tar.xz
671ce264938ab4cacc8af0aabcacb1ecfffa01284b4959441e921264ae19b47e dwarfs-0.10.1-Linux-x86_64-clang-stacktrace.tar.xz
84894bf6a26cac2eb2c8d43d6fccf1ece7665c4c15050cec494d09199bd8310e dwarfs-0.10.1-Linux-x86_64-clang.tar.xz
4041ed9aa19e03f44dbe69b470f31423a3c358bcd07e78230311b859629785b6 dwarfs-0.10.1-Windows-AMD64.7z
db785e0e0f257fa4363d90153db34127add4552791a72998b30ded787840d039 dwarfs-0.10.1.tar.xz
3e003c9a5fbf31b75548c11a2c2c1958f606ce2c2022db4baa6d62b80201c76d dwarfs-universal-0.10.1-Linux-aarch64-clang
44ad0a3f2d89e373b0279d1db7c19aeca46879972a2db226e31ec7ebe8ff103e dwarfs-universal-0.10.1-Linux-aarch64-clang-stacktrace
18f99297c7425bd1bea87d47a2046bfc7e00fe9cc566f024e631ed39a6bb1913 dwarfs-universal-0.10.1-Linux-x86_64-clang
c60821be4a248be2feb54905b5bb6c5cd323014bcb7107f0d586ba7f70deb488 dwarfs-universal-0.10.1-Linux-x86_64-clang-stacktrace
768d013d55cd030c1fbabd35ad522648581c79435da4994cc39de75b3a7eda30 dwarfs-universal-0.10.1-Windows-AMD64.exe
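Published checksums like these are only useful if the download is actually compared against them. One possible way to do that with Python's standard `hashlib` is sketched below; the helper names `sha256_of` and `matches` are made up for this example and are not part of any dwarfs tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with a published checksum string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would look like `matches("dwarfs-0.10.1.tar.xz", "<checksum from the list>")`, with the checksum copied from the release page rather than from the same place the archive was downloaded from, where possible.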
-
🔗 mhx/dwarfs dwarfs-0.10.2 release
Bugfixes
- Gracefully handle localized error message on Windows. These error messages can contain characters from a Windows (non-UTF-8) code page, which could cause a fatal error in `fmt::print` in the logging code. Call sites that log such error messages now try to convert these from the code page to UTF-8 or, if that fails, simply replace all characters that are invalid from a UTF-8 point-of-view. Partial fix for #241.
- Handle invalid wide chars in file names on Windows. For some reason, Windows allows invalid UTF-16 characters in file names. Try to handle these gracefully when converting to UTF-8. Partial fix for #241.
- Workaround for new boost versions which have a `process` component.
- Workaround for a deprecated boost header.
- Support for upcoming Boost 1.87.0. `io_service` was deprecated and replaced by `io_context` in 1.66.0. The upcoming Boost 1.87.0 will remove the deprecated API. (Thanks to Michael Cho for the fix.)
- Disable extended output algorithms (`shake(128|256)`).
- Install libraries to `CMAKE_INSTALL_LIBDIR`. Fixes #240.
- mode/uid/gid checks were expecting 16-bit types.
- stricter metadata checks and improved error messages.
- Various fixes for `filesystem_extractor` to prevent memory leaks, correctly handle errors during extraction, and prevent creation of invalid archive outputs due to padding.
- Various minor fixes: non-virtual dtors, missing includes, `std::move` vs. `std::forward`, unused code removal.
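The first fix in the list describes a two-step fallback: convert from the code page, and if that fails, replace whatever is not valid UTF-8. A loose Python illustration of that strategy follows. It is not the dwarfs implementation (which is C++); the function name and the default `cp1252` code page are assumptions for the sketch.

```python
def message_to_text(raw: bytes, code_page: str = "cp1252") -> str:
    """Decode a localized error message without ever raising.

    First try the (assumed) Windows code page; if the bytes are not valid
    in that code page, fall back to UTF-8 and substitute U+FFFD for every
    invalid sequence.
    """
    try:
        return raw.decode(code_page)
    except (UnicodeDecodeError, LookupError):
        return raw.decode("utf-8", errors="replace")


ok = message_to_text(b"caf\xe9")        # 0xE9 is a valid cp1252 character
fallback = message_to_text(b"bad\x81")  # 0x81 is undefined in cp1252
```

The important property is that logging can no longer fail on the message itself: the worst case is a replacement character in the output.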
Other
- More test cases for stricter metadata checks. Also enable the strict checks in unit tests by default.
- Fix typos in `README.md`. (Thanks to Christian Clauss for the fix.)
- Fix typos in man pages.
New Contributors
Full Changelog: v0.10.1...v0.10.2
SHA-256 Checksums
2f4d275d006228acb2280c3bf5327f02098c2ba110d81fe3854a80f5fd848058 dwarfs-0.10.2-Linux-aarch64-clang-reldbg-stacktrace.tar.xz
75878252ef0bfc490e5bd6ad5870bc5a02531650ceacf1258807e09606069561 dwarfs-0.10.2-Linux-aarch64-clang.tar.xz
74b52460ebd2d8e752ad7fbe976c683be542a8a581fdf25ac59ba1dea5bc5d0c dwarfs-0.10.2-Linux-x86_64-clang-reldbg-stacktrace.tar.xz
a018bfe2531763a273a2d78bc507b1c89fe58a44f7955c980c854a55f9adbaea dwarfs-0.10.2-Linux-x86_64-clang.tar.xz
36767290a39f92782e41daaa3eb45e39550ad1a4294a6d8365bc0f456f75f00c dwarfs-0.10.2.tar.xz
c15280d920b67b51b42117612bd8a959eb5ca9ed0202fd765e19743aad61a728 dwarfs-0.10.2-Windows-AMD64.7z
36f72f1ff049a1d955e68547540b932539beab44b0cba23efbdb7a1b0bfd32d4 dwarfs-universal-0.10.2-Linux-aarch64-clang
4d55e783e352a5aafc321f7ac36964b0493601320d3d93d021634e78e743505d dwarfs-universal-0.10.2-Linux-aarch64-clang-reldbg-stacktrace
b565399a0a671d06be3e078376e02b388ee14133680b8d19483fc93c294b12d2 dwarfs-universal-0.10.2-Linux-x86_64-clang
cb374fc2d64bbf3bd4dd4714f1be37e3d6fc6ecffc7afd93714b6897e9d3751a dwarfs-universal-0.10.2-Linux-x86_64-clang-reldbg-stacktrace
eb69b1bf4703d28bd3d5f477dca1ab3460dda4250c7ce1899eb4192c2c1bef69 dwarfs-universal-0.10.2-Windows-AMD64.exe
-
🔗 mhx/dwarfs dwarfs-0.11.0 release
Bugfixes
- Remove the `access` implementation from the FUSE driver. There's no point here trying to be more clever than FUSE's default. This makes sure DwarFS will behave more like other FUSE file systems. See github discussion #244 for details.
- Limit the number of chunks returned in `inodeinfo` xattr. Highly fragmented files would have megabytes in `inodeinfo`, which not only breaks the xattr interface, but can also dramatically slow down tools like `eza` that like to read xattrs for no apparent reason.
- Avoid nested indentation in manpages to work around `ronn-ng` bug. Fixes github #249.
- Don't link library against `jemalloc`. This fixes both issues with `pydwarfs` and issues building with `jemalloc` support on macOS. Only the binaries are now linked against `jemalloc`, which should be sufficient.
Features
- Support case-insensitive lookups. Fixes github #232.
- Allow setting image size in FUSE driver. Fixes github #239.
- Support extracting a subset of files with `dwarfsextract` using the new `--pattern` option. The same glob patterns can be used as for the filter rules in `mkdwarfs`. Fixes github #243.
- Allow overriding UID / GID for the whole file system when using the FUSE driver on non-Windows platforms. See github discussion #244.
- Expose more LZMA options (`mode`, `mf`, `nice`, `depth`).
- Improve filter patterns, which now support ranges and complementation.
- Improve speed of filesystem `walk` / `walk_data_order` calls by 80% / 40%. The impact of this will largely depend on what code is being run for each inode, but, for example, listing more than 14 million files with `dwarfsck` now takes about 16 seconds compared to 17 seconds with the previous release.
- Added an inode size cache to the metadata to speed up file size computation for large, highly fragmented files. The configuration is currently fixed using a conservative default. Only files with at least 128 chunks will be added to the cache, so in a lot of cases this cache may be completely empty and not contribute to the size of the file system image at all.
- Use bit-packing for hardlink, shared files, and chunk tables. This will consume less memory when loading a DwarFS image.
- Show total hardlink size in `dwarfsck` output.
- Library: return a `dir_entry_view` from `readdir` and `find`. This is more consistent, but was previously not easily possible due to the lack of a "self" dir entry in the metadata. The "self" entry has been added and will only impact the size of the metadata if `directories` metadata is not packed.
- Library: prefer `std::string_view` over `char const*`.
- Library: add directory iterator to `directory_view`.
- Library: support for `maxiov` parameter in `readv` call.
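For readers unfamiliar with the bit-packing mentioned in the feature list, here is a generic sketch of the technique in Python. It is not how DwarFS lays out its tables (the release notes do not say); the function names and the little-endian layout are assumptions chosen for illustration.

```python
def bits_needed(values):
    """Smallest fixed width (at least 1 bit) that can hold every value."""
    return max(1, max(values, default=0).bit_length())


def pack(values, bits):
    """Pack non-negative ints into bytes using `bits` bits per value."""
    word = 0
    for index, value in enumerate(values):
        if value < 0 or value >> bits:
            raise ValueError(f"{value} does not fit in {bits} bits")
        word |= value << (index * bits)
    size = (len(values) * bits + 7) // 8  # round up to whole bytes
    return word.to_bytes(size, "little")


def unpack(data, bits, count):
    """Inverse of pack(): recover `count` values of `bits` bits each."""
    word = int.from_bytes(data, "little")
    mask = (1 << bits) - 1
    return [(word >> (index * bits)) & mask for index in range(count)]


table = [3, 0, 7, 5]
width = bits_needed(table)   # 3 bits are enough for values up to 7
packed = pack(table, width)  # 4 values * 3 bits = 12 bits, stored in 2 bytes
```

Stored as 32-bit integers the same table would take 16 bytes, which is the kind of saving that matters when a table has millions of small entries.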
Other
- Lots of internal refactoring to improve overall code quality.
Full Changelog: v0.10.2...v0.11.0
SHA-256 Checksums
2040a951697ddb78a4b6ee887e06be4295f9d2e3708a311ac72ad9ad7bd28aa3 dwarfs-0.11.0-Linux-aarch64-clang-reldbg-stacktrace.tar.xz
0db0d6bc823d26f86d47f1edf8b4ddbcf85fab24e118be7de9ee091234b5623e dwarfs-0.11.0-Linux-aarch64-clang.tar.xz
a7214b10902653c582aa4c21e05e2476518ed1d15e4640cc3eb2bbe53a297120 dwarfs-0.11.0-Linux-x86_64-clang-reldbg-stacktrace.tar.xz
35e851bce5ba6a17b6b53081d1305ebcee5698d8bc770b8b1a875d2986fd6d7c dwarfs-0.11.0-Linux-x86_64-clang.tar.xz
852c96133444493eff6f03324bc2700e31859d75410a937f0714eae9f75d2dd4 dwarfs-0.11.0.tar.xz
15591223010400488c5066a864bcee3ad71c045e2aa4bf60b7c05e9d45909b9f dwarfs-0.11.0-Windows-AMD64.7z
da197d19b3eadfea5180034765d70c050ae9b85ade58dd0aa91b65283a079236 dwarfs-universal-0.11.0-Linux-aarch64-clang
d58ad14583345d4e7efb4ddb0278ec39c836646a39868422ca1358fa22a990b7 dwarfs-universal-0.11.0-Linux-aarch64-clang-reldbg-stacktrace
72fe171dd9d9abd0bba46e52a983934affbcc9a7349d07854eda91d788ea686b dwarfs-universal-0.11.0-Linux-x86_64-clang
1c5b19c21aca4dc6df8cff3e06358c96fb4e3bb1e969ed3ceef0eb381d84f98b dwarfs-universal-0.11.0-Linux-x86_64-clang-reldbg-stacktrace
f2451ed0832c13157f869a3d7ba3596fcb4bb0c5c55741fc054ce6b1bdc977c8 dwarfs-universal-0.11.0-Windows-AMD64.exe
-
🔗 The Pragmatic Engineer Survey: What’s in your tech stack? rss
We want to capture an accurate snapshot of software engineering, today - and need your help! Tell us about your tech stack and get early access to the final report, plus extra analysis
We'd like to know what tools, languages, frameworks and platforms you are using today. Which tools/frameworks/languages are popular and why? Which ones do engineers love and dislike the most at this moment in time?
With more than 950,000 tech professionals subscribed to this newsletter, we have a unique opportunity to take the industry's pulse by finding out which tech stacks are typical - and which ones are less common.
So, we want to build a realistic picture of this - and share the findings in a special edition devoted to this big topic. But it's only possible with input from you.
We're asking for your help to answer the question: what's in your tech stack? To help, please fill out this survey all about it. Doing so should only take 5 to 15 minutes, covering the platform(s) you work on, the tooling you use, the custom tools you have built, and related topics.
The results will be published in a future edition of The Pragmatic Engineer. If you take part and fill out the survey, you will receive the full results early, plus some extra, exclusive analysis from myself and Elin.
This is the first time we're running a survey that's so ambitious - and we very much appreciate your help. Previous research we did included a reality check on AI tooling and what GenZ software engineers really think. This survey is even more ambitious - and the results should reveal people's typical and atypical tooling choices, across the tech industry. You may even get inspiration for new and different tools, languages, and approaches to try out.
We plan to publish the findings in May.
-
🔗 News Minimalist Israel strikes Gaza, ending ceasefire + 3 more stories rss
Today ChatGPT read 18446 top news stories. After removing previously covered events, there are 4 articles with a significance score over 5.9.
[6.4] Israel resumes attacks in Gaza, ending ceasefire —sandiegouniontribune.com
Israeli airstrikes across the Gaza Strip killed at least 404 Palestinians, breaking a ceasefire in place since January. This escalation threatens to fully reignite the war that has been ongoing for 17 months, with Prime Minister Netanyahu stating the military operation is open-ended.
Civilian casualties included women and children, and the attacks prompted evacuations in eastern Gaza. The Israeli military plans to expand operations beyond airstrikes, as tensions rose amid stalled negotiations for a second phase of the ceasefire aimed at releasing hostages.
The strikes were met with mass protests in Israel, criticizing Netanyahu's leadership during the hostage crisis. The day of the strikes was one of the deadliest of the war; local health officials report that over 48,000 Palestinians have died since the conflict began.
[6.1] Webb telescope directly observes CO2 on exoplanets for the first time —dawn.com
The James Webb Space Telescope has directly observed carbon dioxide in planets outside our solar system for the first time. This was achieved in the HR 8799 system, which is 130 light years away and only 30 million years old.
Researchers used Webb’s coronagraph instruments to view the planets, marking a departure from the usual method of detecting exoplanets when they cross in front of their host star. This new approach allowed scientists to see the light emitted directly from the planets, providing new insights into their atmospheres and formation processes.
[6.1] Germany plans nearly one trillion euros in new debt —dw.com
Germany plans to vote on a bill that would allow it to take on nearly one trillion euros in new debt for military and infrastructure investments. This requires a constitutional change and is unprecedented in the Bundestag's history.
The proposed legislation would ease the country's strict debt limits, allowing both the federal government and states to borrow more. It includes provisions for military spending, infrastructure upgrades, and climate protection, with a total of €500 billion allocated over the next twelve years.
Critics, including the far-right Alternative for Germany and the Left Party, oppose the debt package. Economists warn that this could significantly increase Germany's national debt and impact financial stability in Europe, particularly for already indebted countries.
[6.0] Global trade reaches a record $33 trillion, driven by services —unctad.org
Global trade reached a record $33 trillion in 2024, increasing by 3.7% or $1.2 trillion, according to UNCTAD. Growth was primarily driven by services, which rose 9%.
Developing economies outperformed developed nations, with trade rising 4% overall. East and South Asia led this growth, while trade in Russia, South Africa, and Brazil remained sluggish. Developed economies saw flat trade for the year.
Though trade started stable in early 2025, increasing geoeconomic tensions and policy shifts suggest potential disruptions. Shipping indexes showing reduced demand indicate businesses are adjusting to the changing landscape.
Highly covered news with significance over 5.5
[5.5] US withdraws from Ukraine war crimes investigation center (theguardian.com + 5)
[5.5] US intensifies airstrikes against Yemen's Houthi rebels (apnews.com + 151)
[5.5] Syria attends Brussels donor conference for the first time (news.yahoo.com + 12)
Thanks for reading!
Get access to 2x more stories in the high-significance range (5+) with News Minimalist Premium.
— Vadim
-
🔗 astral-sh/uv 0.6.7 release
Release Notes
If encountering inconsistent wheel version errors, see #12254.
Python
- Add CPython 3.14.0a6
- Fix regression where extension modules would use wrong `CXX` compiler on Linux
- Enable FTS3 enhanced query syntax for SQLite
See the `python-build-standalone` release notes for more details.
Enhancements
- Add support for `-c` constraints in `uv add` (#12209)
- Add support for `--global` default version in `uv python pin` (#12115)
- Always reinstall local source trees passed to `uv pip install` (#12176)
- Render token claims on publish permission error (#12135)
- Add pip-compatible `--group` flag to `uv pip install` and `uv pip compile` (#11686)
Preview features
- Avoid creating duplicate directory entries in built wheels (#12206)
- Allow overriding module names for editable builds (#12137)
Performance
- Avoid replicating core-metadata field on `File` struct (#12159)
Bug fixes
- Add `src` to default cache keys (#12062)
- Discard insufficient fork markers (#10682)
- Ensure `python pin --global` creates parent directories if missing (#12180)
- Fix GraalPy abi tag parsing and discovery (#12154)
- Remove extraneous script packages in `uv sync --script` (#12158)
- Remove redundant `activate.bat` output (#12160)
- Avoid subsequent index hint when no versions are available on the first index (#9332)
- Error on lockfiles with incoherent wheel versions (#12235)
Rust API
- Update `BaseClientBuild` to accept custom proxies (#12232)
Documentation
- Make testpypi index explicit in example snippet (#12148)
- Reverse and format the archived changelogs (#12099)
- Use consistent commas around i.e. and e.g. (#12157)
- Fix typos in MRE docs (#12198)
- Fix double space typo (#12171)
uv 0.6.7
Install uv 0.6.7
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.7/uv-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.7/uv-installer.ps1 | iex"
Download uv 0.6.7
File | Platform | Checksum
---|---|---
uv-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum
uv-build 0.6.7
Install uv-build 0.6.7
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/astral-sh/uv/releases/download/0.6.7/uv-build-installer.sh | sh
Install prebuilt binaries via powershell script
powershell -ExecutionPolicy Bypass -c "irm https://github.com/astral-sh/uv/releases/download/0.6.7/uv-build-installer.ps1 | iex"
Download uv-build 0.6.7
File | Platform | Checksum
---|---|---
uv-build-aarch64-apple-darwin.tar.gz | Apple Silicon macOS | checksum
uv-build-x86_64-apple-darwin.tar.gz | Intel macOS | checksum
uv-build-aarch64-pc-windows-msvc.zip | ARM64 Windows | checksum
uv-build-i686-pc-windows-msvc.zip | x86 Windows | checksum
uv-build-x86_64-pc-windows-msvc.zip | x64 Windows | checksum
uv-build-aarch64-unknown-linux-gnu.tar.gz | ARM64 Linux | checksum
uv-build-i686-unknown-linux-gnu.tar.gz | x86 Linux | checksum
uv-build-powerpc64-unknown-linux-gnu.tar.gz | PPC64 Linux | checksum
uv-build-powerpc64le-unknown-linux-gnu.tar.gz | PPC64LE Linux | checksum
uv-build-s390x-unknown-linux-gnu.tar.gz | S390x Linux | checksum
uv-build-x86_64-unknown-linux-gnu.tar.gz | x64 Linux | checksum
uv-build-armv7-unknown-linux-gnueabihf.tar.gz | ARMv7 Linux | checksum
uv-build-aarch64-unknown-linux-musl.tar.gz | ARM64 MUSL Linux | checksum
uv-build-i686-unknown-linux-musl.tar.gz | x86 MUSL Linux | checksum
uv-build-x86_64-unknown-linux-musl.tar.gz | x64 MUSL Linux | checksum
uv-build-arm-unknown-linux-musleabihf.tar.gz | ARMv6 MUSL Linux (Hardfloat) | checksum
uv-build-armv7-unknown-linux-musleabihf.tar.gz | ARMv7 MUSL Linux | checksum
-
🔗 Aider-AI/aider v0.77.2.dev release
set version to 0.77.2.dev
-
🔗 Aider-AI/aider v0.77.1 release
version bump to 0.77.1
-
🔗 Rust Blog Announcing Rust 1.85.1 rss
The Rust team has published a new point release of Rust, 1.85.1. Rust is a programming language that is empowering everyone to build reliable and efficient software.
If you have a previous version of Rust installed via rustup, getting Rust 1.85.1 is as easy as:
rustup update stable
If you don't have it already, you can get `rustup` from the appropriate page on our website.
What's in 1.85.1
Fixed combined doctest compilation
Due to a bug in the implementation, combined doctests did not work as intended in the stable 2024 Edition. Internal errors with feature stability caused rustdoc to automatically use its "unmerged" fallback method instead, like in previous editions.
Those errors are now fixed in 1.85.1, realizing the performance improvement of combined doctest compilation as intended! See the backport issue for more details, including the risk analysis of making this behavioral change in a point release.
Other fixes
1.85.1 also resolves a few regressions introduced in 1.85.0:
- Relax some `target_feature` checks when generating docs.
- Fix errors in `std::fs::rename` on Windows 1607.
- Downgrade bootstrap `cc` to fix custom targets.
- Skip submodule updates when building Rust from a source tarball.
Contributors to 1.85.1
Many people came together to create Rust 1.85.1. We couldn't have done it without all of you. Thanks!
-
🔗 Baby Steps Rust in 2025: Language interop and the extensible compiler rss
For many years, C has effectively been the "lingua franca" of the computing world. It's pretty hard to combine code from two different programming languages in the same process unless one of them is C. The same could theoretically be true for Rust, but in practice there are a number of obstacles that make that harder than it needs to be. Building out silky smooth language interop should be a core goal of helping Rust to target foundational applications. I think the right way to do this is not by extending rustc with knowledge of other programming languages but rather by building on Rust's core premise of being an extensible language. By investing in building out an "extensible compiler" we can allow crate authors to create a plethora of ergonomic, efficient bridges between Rust and other languages.
We'll know we've succeeded when…
When it comes to interop…
- It is easy to create a Rust crate that can be invoked from other languages and across multiple environments (desktop, Android, iOS, etc). Rust tooling covers the full story from writing the code to publishing your library.
- It is easy[1] to carve out parts of an existing codebase and replace them with Rust. It is particularly easy to integrate Rust into C/C++ codebases.
When it comes to extensibility…
- Rust is host to a wide variety of extensions ranging from custom lints and diagnostics ("clippy as a regular library") to integration and interop (ORMs, languages) to static analysis and automated reasoning.
Lang interop: the least common denominator use case
In my head, I divide language interop into two core use cases. The first is what I call Least Common Denominator (LCD), where people would like to write one piece of code and then use it in a wide variety of environments. This might mean authoring a core SDK that can be invoked from many languages but it also covers writing a codebase that can be used from both Kotlin (Android) and Swift (iOS) or having a single piece of code usable for everything from servers to embedded systems. It might also be creating WebAssembly components for use in browsers or on edge providers.
What distinguishes the LCD use case is two things. First, it is primarily unidirectional: calls mostly go from the other language to Rust. Second, you don't have to handle all of Rust. You really want to expose an API that is "simple enough" that it can be expressed reasonably idiomatically from many other languages. Examples of libraries supporting this use case today are uniffi and diplomat. This problem is not new; it's the same basic use case that WebAssembly components are targeting, as well as old school things like COM and CORBA (in my view, though, each of those solutions is a bit too narrow for what we need).
When you dig in, the requirements for LCD get a bit more complicated. You want to start with simple types, yes, but people quickly start asking for ways to make the generated wrapper for a given language more idiomatic. And you want to focus on calls into Rust, but you also need to support callbacks. In fact, to really integrate with other systems, you need generic facilities for things like logs, metrics, and I/O that can be mapped in different ways. For example, in a mobile environment, you don't necessarily want to use tokio to do an outgoing networking request. It is better to use the system libraries since they have special cases to account for the quirks of radio-based communication.
To really crack the LCD problem, you also have to solve a few other problems:
- It needs to be easy to package up Rust code and upload it into the appropriate package managers for other languages. Think of a tool like maturin, which lets you bundle up Rust binaries as Python packages.
- For some use cases, download size is a very important constraint. Optimizing for size is currently hard to get started with. What's worse, your binary has to include code from the standard library, since we can't expect to find it on the device, and even if we could, we couldn't be sure it was ABI compatible with the one you built your code with.
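To make the shape of an LCD-style API a little more concrete, here is a minimal sketch, not taken from uniffi, diplomat, or any other real tool. It assumes a hypothetical `Sdk` type whose public surface uses only strings and integers, and a hypothetical `HttpClient` trait standing in for a "generic facility" that the host environment supplies (on mobile, that would be the system networking stack rather than tokio). `FakeClient` exists only so the example is self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical facility supplied by the host environment. A mobile app
/// would implement this on top of the platform's networking stack.
pub trait HttpClient {
    fn get(&self, url: &str) -> Result<String, String>;
}

/// Hypothetical SDK entry point. Its public surface uses only strings and
/// integers so that it could be expressed idiomatically in many languages.
pub struct Sdk {
    client: Box<dyn HttpClient>,
}

impl Sdk {
    pub fn new(client: Box<dyn HttpClient>) -> Self {
        Sdk { client }
    }

    /// Fetches `url` through the host's client and returns the body length in bytes.
    pub fn fetch_len(&self, url: &str) -> Result<u64, String> {
        let body = self.client.get(url)?;
        Ok(body.len() as u64)
    }
}

/// Stand-in host implementation for the example: serves canned pages from memory.
pub struct FakeClient {
    pages: HashMap<String, String>,
}

impl FakeClient {
    pub fn with_page(url: &str, body: &str) -> Self {
        let mut pages = HashMap::new();
        pages.insert(url.to_string(), body.to_string());
        FakeClient { pages }
    }
}

impl HttpClient for FakeClient {
    fn get(&self, url: &str) -> Result<String, String> {
        self.pages
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no page registered for {url}"))
    }
}
```

Calls go mostly one way (host into `Sdk`), while the boxed trait object is the callback path going back out. Errors are plain strings here only to keep the surface small; a real binding layer would likely want something richer.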
Needed: the "serde" of language interop
Obviously, there's enough here to keep us going for a long time. I think the place to start is building out something akin to the "serde" of language interop: the serde package itself just defines the core trait for serialization and a derive. All of the format-specific details are factored out into other crates defined by a variety of people.
I'd like to see a universal set of conventions for defining the "generic API" that your Rust code follows and then a tool that extracts these conventions and hands them off to a backend to do the actual language-specific work. It's not essential, but I think this core dispatching tool should live in the rust-lang org. All the language-specific details, on the other hand, would live on crates.io as crates that can be created by anyone.
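As a rough illustration of that split, the sketch below separates a tiny language-neutral description of an API from the backends that render it. Everything here is hypothetical: `Ty`, `FnDecl`, `Backend`, `KotlinBackend`, `SwiftBackend`, and `generate` are names made up for this example, and the emitted strings are only meant to look plausible, not to match what any real binding generator produces.

```rust
/// Language-neutral types the hypothetical "generic API" is allowed to use.
#[derive(Debug, Clone, Copy)]
pub enum Ty {
    I64,
    Str,
}

/// Language-neutral description of one exported function.
pub struct FnDecl {
    pub name: &'static str,
    pub params: Vec<(&'static str, Ty)>,
    pub ret: Ty,
}

/// The core crate would define only this trait; each backend could live
/// in its own crate, much like serde formats do.
pub trait Backend {
    fn emit(&self, decl: &FnDecl) -> String;
}

pub struct KotlinBackend;
pub struct SwiftBackend;

fn kotlin_ty(t: Ty) -> &'static str {
    match t {
        Ty::I64 => "Long",
        Ty::Str => "String",
    }
}

fn swift_ty(t: Ty) -> &'static str {
    match t {
        Ty::I64 => "Int64",
        Ty::Str => "String",
    }
}

impl Backend for KotlinBackend {
    fn emit(&self, decl: &FnDecl) -> String {
        let params: Vec<String> = decl
            .params
            .iter()
            .map(|(name, ty)| format!("{}: {}", name, kotlin_ty(*ty)))
            .collect();
        format!("external fun {}({}): {}", decl.name, params.join(", "), kotlin_ty(decl.ret))
    }
}

impl Backend for SwiftBackend {
    fn emit(&self, decl: &FnDecl) -> String {
        let params: Vec<String> = decl
            .params
            .iter()
            .map(|(name, ty)| format!("{}: {}", name, swift_ty(*ty)))
            .collect();
        format!("func {}({}) -> {}", decl.name, params.join(", "), swift_ty(decl.ret))
    }
}

/// The "dispatching tool": hands every extracted declaration to one backend.
pub fn generate(decls: &[FnDecl], backend: &dyn Backend) -> Vec<String> {
    decls.iter().map(|d| backend.emit(d)).collect()
}
```

The point of the design is that `generate` knows nothing about Kotlin or Swift; adding a new target language means adding a new `Backend` implementation without touching the core.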
Lang interop: the "deep interop" use case
The second use case is what I call the deep interop problem. For this use case, people want to be able to go deep in a particular language. Often this is because their Rust program needs to invoke APIs implemented in that other language, but it can also be that they want to stub out some part of that other program and replace it with Rust. One common example that requires deep interop is embedded developers looking to invoke gnarly C/C++ header files supplied by vendors. Deep interop also arises when you have an older codebase, such as the Rust for Linux project attempting to integrate Rust into their kernel or companies looking to integrate Rust into their existing codebases, most commonly C++ or Java.
Some of the existing deep interop crates focus specifically on the use case of invoking APIs from the other language (e.g., bindgen and duchess) but most wind up supporting bidirectional interaction (e.g., pyo3, npapi-rs, and neon). One interesting example is cxx, which supports bidirectional Rust-C++ interop, but does so in a rather opinionated way, encouraging you to make use of a subset of C++'s features that can be readily mapped (in this way, it's a bit of a hybrid of LCD and deep interop).
Interop with all languages is important. C and C++ are just more so.
I want to see smooth interop with all languages, but C and C++ are particularly important. This is because they have historically been the language of choice for foundational applications, and hence there is a lot of code that we need to integrate with. Integration with C today in Rust is, in my view, "ok": most of what you need is there, but it's not as nicely integrated into the compiler or as accessible as it should be. Integration with C++ is a huge problem. I'm happy to see the Foundation's Rust-C++ Interoperability Initiative as well as projects like Google's crubit and of course the venerable cxx.
Needed: "the extensible compiler"
The traditional way to enable seamless interop with another language is to "bake it in"; for example, Kotlin has very smooth support for invoking Java code and Swift/Zig can natively build C and C++. I would prefer for Rust to take a different path, one I call the extensible compiler. The idea is to enable interop via, effectively, supercharged procedural macros that can integrate with the compiler to supply type information, generate shims and glue code, and generally manage the details of making Rust "play nicely" with another language.
In some sense, this is the same thing we do today. All the crates I mentioned above leverage procedural macros and custom derives to do their job. But procedural macros today are the "simplest thing that could possibly work": tokens in, tokens out. Considering how simplistic they are, they've gotten us remarkably far, but they also have distinct limitations. Error messages generated by the compiler are not expressed in terms of the macro input but rather the Rust code that gets generated, which can be really confusing; macros are not able to access type information or communicate information between macro invocations; macros cannot generate code on demand, as it is needed, which means that we spend time compiling code we might not need but also that we cannot integrate with monomorphization. And so forth.
I think we should integrate procedural macros more deeply into the compiler.[2] I'd like macros that can inspect types, that can generate code in response to monomorphization, that can influence diagnostics[3] and lints, and maybe even customize things like method dispatch rules. That will allow all people to author crates that provide awesome interop with all those languages, but it will also help people write crates for all kinds of other things. To get a sense for what I'm talking about, check out F#'s type providers and what they can do.
The challenge here will be figuring out how to keep the stabilization surface area as small as possible. Whenever possible I would look for ways to have macros communicate by generating ordinary Rust code, perhaps with some small tweaks. Imagine macros that generate things like a "virtual function", that has an ordinary Rust signature but where the body for a particular instance is constructed by a callback into the procedural macro during monomorphization. And what format should that body take? Ideally, it'd just be Rust code, so as to avoid introducing any new surface area.
Not needed: the Rust Evangelism Task Force
So, it turns out I'm a big fan of Rust. And, I ain't gonna lie, when I see a prominent project pick some other language, at least in a scenario where Rust would've done equally well, it makes me sad. And yet I also know that if every project were written in Rust, that would be so sad. I mean, who would we steal good ideas from?
I really like the idea of focusing our attention on making Rust work well with other languages, not on convincing people Rust is better[4]. The easier it is to add Rust to a project, the more people will try it, and if Rust is truly a better fit for them, they'll use it more and more.
Conclusion: next steps
This post laid out a north star where
- a single Rust library can be easily used across many languages and environments;
- Rust code can easily call and be called by functions in other languages;
- this is all implemented atop a rich procedural macro mechanism that lets plugins inspect type information, generate code on demand, and so forth.
How do we get there? I think there are some concrete next steps:
- Build out, adopt, or extend an easy system for producing "least common denominator" components that can be embedded in many contexts.
- Support the C++ interop initiatives at the Foundation and elsewhere. The wheels are turning: tmandry is the point of contact for the project goal on that, and we recently held our first lang-team design meeting on the topic (this document is a great read, highly recommended!).
- Look for ways to extend proc macro capabilities and explore what it would take to invoke them from other phases of the compiler besides just the very beginning.
- An aside: I also think we should extend rustc to support compiling proc macros to WebAssembly and use that by default. That would allow for strong sandboxing and deterministic execution and also easier caching to support faster build times.
- [1] Well, as easy as it can be.
- [2] Rust's incremental compilation system is pretty well suited to this vision. It works by executing an arbitrary function and then recording what bits of the program state that function looks at. The next time we run the compiler, we can see if those bits of state have changed to avoid re-running the function. The interesting thing is that this function could as well be part of a procedural macro; it doesn't have to be built in to the compiler.
- [3] Stuff like the `diagnostics` tool attribute namespace is super cool! More of this!
- [4] I've always been fond of this article: Rust vs Go, "Why they're better together".
-
- March 17, 2025
-
🔗 sacha chua :: living an awesome life Org Mode: Merge top-level items in an item list rss
I usually summarize Mastodon links, move them to my Emacs News Org file, and then categorize them. Today I accidentally categorized the links while they were still in my Mastodon buffer, so I had two lists with categories. I wanted to write some Emacs Lisp to merge sublists based on the top-level items. I could sort the list alphabetically with `C-c ^` (`org-sort`) and then delete the redundant top-level item lines, but it's fun to tinker with Emacs Lisp.
Example input:
- Topic A:
  - Item 1
  - Item 2
    - Item 2.1
- Topic B:
  - Item 3
- Topic A:
  - Item 4
    - Item 4.1
Example output:
- Topic B:
  - Item 3
- Topic A:
  - Item 1
  - Item 2
    - Item 2.1
  - Item 4
    - Item 4.1
The sorting doesn't particularly matter to me, but I want the things under Topic A to be combined. Someday it might be nice to recursively merge other entries (ex: if there's another "Topic A: - Item 2" subitem like "Item 2.2"), but I don't need that yet.
Anyway, we can parse the list with `org-list-to-lisp` (which can even delete the original list) and recreate it with `org-list-to-org`, so then it's a matter of transforming the data structure.
(defun my-org-merge-list-entries-at-point ()
  "Merge entries in a nested Org Mode list at point that have the same top-level item text."
  (interactive)
  (save-excursion
    (let* ((list-indentation (save-excursion
                               (goto-char (caar (org-list-struct)))
                               (current-indentation)))
           (list-struct (org-list-to-lisp t))
           (merged-list (my-org-merge-list-entries list-struct)))
      (insert (org-ascii--indent-string (org-list-to-org merged-list) list-indentation)
              "\n"))))

(defun my-org-merge-list-entries (list-struct)
  "Merge an Org list based on its top-level headings"
  (cons (car list-struct)
        (mapcar
         (lambda (g)
           (list
            (car g)
            (let ((list-type (car (car (cdr (car (cdr g))))))
                  (entries (seq-mapcat #'cdar (mapcar #'cdr (cdr g)))))
              (apply #'append (list list-type) entries nil))))
         (seq-group-by #'car (cdr list-struct)))))
A couple of test cases:
(ert-deftest my-org-merge-list-entries ()
  (should
   (equal
    (my-org-merge-list-entries
     '(unordered ("Topic B:" (unordered ("Item 3")))))
    '(unordered ("Topic B:" (unordered ("Item 3"))))))
  (should
   (equal
    (my-org-merge-list-entries
     '(unordered ("Topic B:" (unordered ("Item 3")))
                 ("Topic A:" (unordered ("Item 1")
                                        ("Item 2" (unordered ("Item 2.1")))))
                 ("Topic A:" (unordered ("Item 4" (unordered ("Item 4.1")))))))
    '(unordered ("Topic B:" (unordered ("Item 3")))
                ("Topic A:" (unordered ("Item 1")
                                       ("Item 2" (unordered ("Item 2.1")))
                                       ("Item 4" (unordered ("Item 4.1")))))))))
Updating my custom links to also export to Org
Because `org-list-to-org` uses the Org conversion process, I need to make sure that my custom link functions also export to Org as a format. For example, in Emacs News, I use a package: link to make it easy to link to packages in both Emacs and in exported HTML. When I first ran my code, the links got replaced with their URLs, which isn't what I wanted. Turned out that I needed to add a case handling exporting to `org` format, like this:
(defun my-org-package-export (link description format &optional arg)
  (let* ((package-info (car (assoc-default (intern link) package-archive-contents)))
         (package-source (and package-info (package-desc-archive package-info)))
         (path (format (cond
                        ((null package-source) link)
                        ((string= package-source "gnu") "https://elpa.gnu.org/packages/%s.html")
                        ((string= package-source "melpa") "https://melpa.org/#/%s")
                        ((string= package-source "nongnu") "https://elpa.nongnu.org/nongnu/%s.html")
                        (t (error 'unknown-source)))
                       link))
         (desc (or description link)))
    (if package-source
        (cond
         ((eq format '11ty) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'html) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'wp) (format "<a target=\"_blank\" href=\"%s\">%s</a>" path desc))
         ((eq format 'latex) (format "\\href{%s}{%s}" path desc))
         ((eq format 'texinfo) (format "@uref{%s,%s}" path desc))
         ((eq format 'ascii) (format "%s <%s>" desc path))
         ((eq format 'org) (org-link-make-string (concat "package:" link) description)) ;; added this line
         (t path))
      desc)))
-
🔗 sacha chua :: living an awesome life 2025-03-17 Emacs news rss
- Upcoming events (iCal file, Org):
- M-x Research: TBA https://m-x-research.github.io/ Wed Mar 19 0900 America/Vancouver - 1100 America/Chicago - 1200 America/Toronto - 1600 Etc/GMT - 1700 Europe/Berlin - 2130 Asia/Kolkata – Thu Mar 20 0000 Asia/Singapore
- Emacs APAC: Emacs APAC meetup (virtual) https://emacs-apac.gitlab.io/announcements/ Sat Mar 22 0130 America/Vancouver - 0330 America/Chicago - 0430 America/Toronto - 0830 Etc/GMT - 0930 Europe/Berlin - 1400 Asia/Kolkata - 1630 Asia/Singapore
- EmacsSF (in person): coffee.el in SF https://www.meetup.com/emacs-sf/events/306610734/ Sat Mar 22 1100 America/Los_Angeles
- Emacs Berlin (hybrid, in English) https://emacs-berlin.org/ Wed Mar 26 1030 America/Vancouver - 1230 America/Chicago - 1330 America/Toronto - 1730 Etc/GMT - 1830 Europe/Berlin - 2300 Asia/Kolkata – Thu Mar 27 0130 Asia/Singapore
- Emacs 30:
- Beginner:
- Editing Files with Emacs (11:29)
- GNU Emacs má 40 let (Tomáš Čech) (23:23)
- Emacs என்பது யாதெனில் - அத்தியாயம் ஒன்று - Sakhil (01:00:55)
- Emacs configuration:
- Emacs Lisp:
- New package raq.el: HTTP Library Adapter for Emacs, suport url.el and plz.el, and can be extended. (Reddit)
- compile-angel - Ensure all Elisp files are both Byte and Native-Compiled (Alternative to: auto-compile) - Release 1.0.6 (Reddit)
- tip about get-mru-window (most recently used)
- Some problems of modernizing Emacs (incomplete - slides 0 to 6 only) (20:39)
- Appearance:
- Tip about using consult-theme from the consult package to preview themes quickly
- Announcing Calle 24 (Reddit, Irreal) - substitutes the default tool bar icons with those from SF Symbols, a library of images provided by Apple
- tomorrow-night-deepblue-theme (Release 1.2.1): A deep blue Emacs theme, inspired by the Tomorrow Night theme
- My Unique Emacs Theme Pack – Now Available for Download! (Reddit) - screenshots are in the Details heading
- Navigation:
- Noether: global minor mode managing user-defined posframe Views (Reddit)
- easysession.el 1.1.3: Persist and restore Emacs sessions including frames, tab-bar, buffers, indirect buffers, Dired, and window splits (Reddit)
- The Emacs recursive narrow package #coding #programming (02:41)
- A tutorial on 2 Columns in Emacs #emacs #coding #programming (11:32)
- Writing:
- Hugo-Heagren/clever-cite: format quoted text as you kill/yank between buffers (@HugoHeagren@scholar.social)
- quick-sdcv.el (Release 1.0.1): Turn Emacs into an offline dictionary with sdcv (Reddit)
- Editing Overleaf Documents with Emacs | Valentin Boettcher (Reddit) - nice! live editing, might be useful for collaboration too
- Chung-hong Chan: Miscellaneous talk #45 Quarto guy, Overleaf’s markdown mode?, Flymake, Mailing lists, CRAN - ESS also
- Wherein I Explain Why Emacs Is The Best Tool For WordPress (Reddit)
- Meet Harper | A Grammarly Alternative for Neovim | Emacs, Obsidian, Zed, VScode and Helix (deez) (21:16)
- Org Mode:
- Using Emacs Org mode to manage my appointments
- Calendar.org (Reddit)
- Org-anizing My Fragrance Collection with Emacs – literatelisp.eu (@theesm@social.tchncs.de)
- Sacha Chua: Remove open Org Mode clock entries
- a template for writing OWL ontologies as Org Mode documents, with supporting functions and scripts
- org-expose-emphasis-markers: A new package used to automatically show hidden emphasis markers at point in org mode when `org-hide-emphasis-markers` is on. (Reddit)
- Seam: personal wiki system based on Org mode (@spnw@gts.plexwave.org)
- Using an Org-mode README on SourceHut (@tiang@mastodon.social)
- Denote:
- Coding:
- using hideshow minor mode in all programming modes so you can toggle code sections (@dotemacs@mastodon.xyz)
- Announcing Casual Make (Reddit, Irreal)
- eglot-inactive-regions - eglot extension to dim inactive preprocessor branches - release: 0.6.4 (Reddit)
- I found an easy way to make code comments appear in other mode's syntax
- A story of mystery involving LS, clangd and a very easy fix
- flymake-bashate.el (1.0.2) - A Flymake backend for bashate: Real-time style checking for Bash shell scripts
- Bozhidar Batsov: neocaml: a new Emacs package for OCaml programming
- swyddfa/esbonio.el: Integrating the esbonio language server into Emacs (@alcarney@fosstodon.org)
- Mail, news, and chat:
- AI:
- gptel 0.9.8 released (tool-use, support for "reasoning" output, dry-run options and more) (Reddit)
- gptel-aibo update: new complete at point
- GitHub - rajp152k/fabric-gpt.el: Fabric Prompts for emacs gpt.el
- Aidermacs v1.0 Released. Available Now on Melpa and Non-GNU Elpa! (Reddit) (also 0.5.0 discussion)
- James Dyer: Ollama-Buddy 0.8.0 - Added System Prompts, Model Info and simpler menu model assignment
- James Dyer: Ollama-Buddy 0.7.1 - Org-mode Chat, Parameter Control and JSON Debugging
- claude-code.el (Reddit)
- lizqwerscott/mcp.el: An Mcp client inside Emacs (Reddit) - model context protocol servers
- Introducing forge-llm: Generate PR descriptions automatically with LLMs in Emacs Forge
- Community:
- Other:
- Emacs development:
- emacs-devel:
- Improving the Tools menu
- Re: Semantic: update or remove?
- Re: Semantic: update or remove? - more thoughts on tree-sitter and the Emacs ecosystem
- Re: Markers in a gap array - sorted array with gap
- Make marking conflicted files as resolved upon saving opt-out
- ; Add NEWS entry for java-ts-mode-method-chaining-indent-offset
- ; etc/NEWS (remember-prefix-map): Suggest a key reserved to users.
- New project-save-some-buffers command
- Add a new command `speedbar-window'.
- dired-copy-filename-as-kill: Support project-relative names
- Make turn-on-flyspell/turn-off-flyspell obsolete
- Improve tramp-*-with-sudo commands
- ; * etc/NEWS: Announce the larger number of sub-processes on w32.
- New configure option --with-systemduserunitdir
- Allow control of indicating empty rectangular selections
- Turn 'remember-mode' into a minor mode
- diff-apply-buffer now considers the region and can reverse-apply.
- Fix capitalization ELisp -> Elisp
- Automatically document when setopt is needed
- New user option follow-mode-prefix-key
- Remove variable aliases obsolete since Emacs 23.2
- New user variable `exchange-point-and-mark-highlight-region`
- New packages:
- aider: Interact with Aider: AI pair programming made simple (MELPA)
- company-forge: Company backend for assignees and topics from forge (MELPA)
- denote-journal: Convenience functions for daily journaling with Denote (GNU ELPA)
- denote-markdown: Extensions that better integrate Denote with Markdown (GNU ELPA)
- denote-org: Denote extensions for Org mode (GNU ELPA)
- denote-sequence: Sequence notes or Folgezettel with Denote (GNU ELPA)
- denote-silo: Convenience functions for using Denote in multiple silos (GNU ELPA)
- indexed: Cache metadata on all Org files (MELPA)
- jira: Emacs Interface to Jira (MELPA)
- ob-aider: Org Babel functions for Aider.el integration (MELPA)
- org-expose-emphasis-markers: Automatically show hidden org emphasis markers (MELPA)
Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Mastodon #emacs, Bluesky #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!
-