

  1. About KeePassXC’s Code Quality Control – KeePassXC
  2. How to build a remarkable command palette
  3. Leaderboard - compar:IA, the AI chatbot arena
  4. Who needs Graphviz when you can build it yourself? | SpiderMonkey JavaScript/WebAssembly Engine
  5. Automerge

  1. November 25, 2025
    1. 🔗 gulbanana/gg GG 0.35.1 release

      Added

      • New config option gg.ui.track-recent-workspaces, which can be set to false to disable saving recent workspaces to the config file.

      Fixed

      • Another memory leak (failure to deregister RAF callbacks).
      • Some broken graph rendering (which was relying on the previous leak!).
    2. 🔗 gulbanana/gg GG 0.35.0 release

      This release is based on Jujutsu 0.35.

      Fixed

      • Memory leak in the log pane (thanks to @brk).
    3. 🔗 HexRaysSA/plugin-repository commits sync repo: +2 plugins, +2 releases rss
      sync repo: +2 plugins, +2 releases
      
      ## New plugins
      - [funcfiletree](https://github.com/rand-tech/idaplugins) (1.0)
      - [navigator](https://github.com/rand-tech/idaplugins) (1.3)
      
    4. 🔗 syncthing/syncthing v2.0.12-rc.1 release

      Major changes in 2.0

      • Database backend switched from LevelDB to SQLite. There is a migration on
        first launch which can be lengthy for larger setups. The new database is
        easier to understand and maintain and, hopefully, less buggy.

      • The logging format has changed to use structured log entries (a message
        plus several key-value pairs). Additionally, we can now control the log
        level per package, and a new log level WARNING has been inserted between
        INFO and ERROR (which was previously known as WARNING...). The INFO level
        has become more verbose, indicating the sync actions taken by Syncthing. A
        new command line flag --log-level sets the default log level for all
        packages, and the STTRACE environment variable and GUI have been updated
        to set log levels per package. The --verbose and --logflags command
        line options have been removed and will be ignored if given.

      • Deleted items are no longer kept forever in the database, instead they are
        forgotten after fifteen months. If your use case requires deletes to take
        effect after more than a fifteen month delay, set the
        --db-delete-retention-interval command line option or corresponding
        environment variable to zero, or a longer time interval of your choosing.

      • Modernised command line options parsing. Old single-dash long options are
        no longer supported, e.g. -home must be given as --home. Some options
        have been renamed, others have become subcommands. All serve options are
        now also accepted as environment variables. See syncthing --help and
        syncthing serve --help for details.

      • Rolling hash detection of shifted data is no longer supported as this
        effectively never helped. Instead, scanning and syncing is faster and more
        efficient without it.

      • A "default folder" is no longer created on first startup.

      • Multiple connections are now used by default between v2 devices. The new
        default value is to use three connections: one for index metadata and two
        for data exchange.

      • The following platforms unfortunately no longer get prebuilt binaries for
        download at syncthing.net and on GitHub, due to complexities related to
        cross compilation with SQLite:

        • dragonfly/amd64
        • solaris/amd64
        • linux/ppc64
        • netbsd/*
        • openbsd/386 and openbsd/arm
        • windows/arm

      • The handling of conflict resolution involving deleted files has changed. A
        delete can now be the winning outcome of conflict resolution, resulting in
        the deleted file being moved to a conflict copy.

      This release is also available as:

      • APT repository: https://apt.syncthing.net/

      • Docker image: docker.io/syncthing/syncthing:2.0.12-rc.1 or ghcr.io/syncthing/syncthing:2.0.12-rc.1
        ({docker,ghcr}.io/syncthing/syncthing:2 to follow just the major version)

      What's Changed

      Other

      • chore: update quic-go, adapt to lack of write tracking by @calmh in #10456
      • chore(cli): clean up generated usage strings for config commands (fixes #10462) by @acolomb in #10463

      Full Changelog : v2.0.11...v2.0.12-rc.1

    5. 🔗 Rust Blog Interview with Jan David Nose rss

      On the Content Team, we had our first whirlwind outing at RustConf 2025 in Seattle, Washington, USA. There we had a chance to speak with folks about interesting things happening in the Project and the wider community.

      Jan David Nose, Infrastructure Team

      In this interview, Xander Cesari sits down with Jan David Nose, then one of the full-time engineers on the Infrastructure Team, which maintains and develops the infrastructure upon which Rust is developed and deployed -- including CI/CD tooling and crates.io.

      We released this video on an accelerated timeline, some weeks ago, in light of the recent software supply chain attacks, but the interview was conducted prior to the news of compromised packages in other languages and ecosystems.

      Check out the interview here or click below.


      Transcript

      Xander Cesari : Hey, this is Xander Cesari with the Rust Project Content Team, recording on the last hour of the last day of RustConf 2025 here in Seattle. So it's been a long and amazing two days. And I'm sitting down here with a team member from the Rust Project Infra Team, the unsung heroes of the Rust language. Want to introduce yourself and kind of how you got involved?

      Jan David Nose : Yeah, sure. I'm JD. Jan David is the full name, but especially in international contexts, I just go with JD. I've been working for the Rust Foundation for the past three years as a full-time employee and I essentially hit the jackpot to work full-time on open source and I've been in the Infra Team of the Rust Project for the whole time. For the past two years I've led the team together with Jake. So the Infra Team is kind of a thing that lets Rust happen and there's a lot of different pieces.

      Xander Cesari : Could you give me an overview of the responsibility of the Infra Team?

      Jan David Nose : Sure. I think on a high level, we think about this in terms of, we serve two different groups of people. On one side, we have users of the language, and on the other side, we really try to provide good tooling for the maintainers of the language.

      Jan David Nose : Starting with the maintainer side, this is really everything about how Rust is built. From the moment someone makes a contribution or opens a PR, we maintain the continuous integration that makes sure that the PR actually works. There's a lot of bots and tooling helping out behind the scenes to kind of maintain a good status quo, a sane state. Lots of small things like triage tools on GitHub to set labels and ping people and these kinds of things. And that's kind of managed by the Infra Team at large.

      Jan David Nose : And then on the user side, we have a lot of, or the two most important things are making sure users can actually download Rust. We don't develop crates.io, but we support the infrastructure to actually ship crates to users. All the downloads go through content delivery networks that we provide. The same for Rust releases. So if I don't do my job well, which has happened, there might be a global outage of crates.io and no one can download stuff. But those are kind of the two different buckets of services that we run and operate.

      Xander Cesari : Gotcha. So on the maintainer side, the Rust organization on GitHub is a large organization with a lot of activity, a lot of code. There's obviously a lot of large code bases being developed on GitHub, but there are not that many languages the size of Rust being developed on GitHub. Are there unique challenges to developing a language and the tooling that's required versus developing other software projects?

      Jan David Nose : I can think of a few things that have less to do with the language specifically, but with some of the architecture decisions that were made very early on in the life cycle of Rust. So one of the things that actually caused a lot of headache for mostly GitHub, and then when they complained to us, for us as well, is that for a long, long time, the index for crates.io was a Git repo on GitHub. As Rust started to grow, the activity on the repo became so big that it actually caused some issues, I would say, in a friendly way on GitHub, just in terms of how much resources that single repository was consuming. That then kind of started this work on a web-based, HTTP-based index to shift that away. That's certainly one area where we've seen how Rust has struggled a little bit with the platform, but also the platform provider struggled with us.

      Jan David Nose : I think for Rust itself, especially when we look at CI, we really want to make sure that Rust works well on all of the targets and all the platforms we support. That means we have an extremely wide CI pipeline where, for every Tier 1 target, we want to run all the tests, we want to build the release artifacts, we want to upload all of that to S3. We want to do as much as we reasonably can for Tier 2 targets and, to a lesser extent, maybe even test some stuff on Tier 3. That has turned into a gigantic build pipeline. Marco gave a talk today on what we've done with CI over the last year. One of the numbers that came out of doing the research for this talk is that we accumulate over three million build minutes per month, which is about six years of CPU time every month.

      Jan David Nose : Especially when it comes to open source projects, I think we're one of the biggest consumers of GitHub Actions in that sense. Not the biggest in total; there are definitely bigger commercial projects. But that's a unique challenge for us to manage because we want to provide as good a service as we can to the community and make sure that what we ship is high quality. That comes at a huge cost in terms of scaling. As Rust gets more popular and we want to target more and more platforms, this is like a problem that just continues to grow.

      Jan David Nose : We'll probably never remove a lot of targets, so there's an interesting challenge to think about. If it's already big now, how does this look in 5 years, 10 years, 15 years, and how can we make sure we can maintain the level of quality we want to ship? When you build and run for a target in the CI pipeline, some of those Tier 1 targets you can just ask a cloud service provider to give you a VM running on that piece of hardware, but some of them are probably not things that you can just run in the cloud.

      Xander Cesari : Is there some HIL (Hardware-In-the-Loop) lab somewhere?

      Jan David Nose : So you're touching on a conversation that's happening pretty much as we speak. So far, as part of our target tier policy, there is a clause that says it needs to be able to run in CI. That has meant being very selective about only promoting things to Tier 1 that we can actually run and test. For all of this, we had a prerequisite that it runs on GitHub Actions. So far we've used very little hardware that is not natively supported or provided by GitHub.

      Jan David Nose : But this is exactly the point with Rust increasing in popularity. We just got requests to support IBM platforms and RISC-V, and those are not natively supported on GitHub. That has kicked off an internal conversation about how we even support this. How can we as a project enable companies that can provide us hardware to test on? What are the implications of that?

      Jan David Nose : On one side, there are interesting constraints and considerations. For example, you don't want your PRs to randomly fail because someone else's hardware is not available. We're already so resource-constrained on how many PRs we can merge each day that adding noise to that process would really slow down contributions to Rust. On the other side, there are security implications. Especially if we talk about promoting something to Tier 1 and we want to build release artifacts on that hardware, we need to make sure that those are actually secure and no one sneaks a back door into the Rust compiler target for RISC-V.

      Jan David Nose : So there are interesting challenges for us, especially in the world we live in where supply chain security is a massive concern. We need to figure out how we can both support the growth of Rust and the growth of the language, the community, and the ecosystem at large while also making sure that the things we ship are reliable, secure, and performant. That is becoming an increasingly relevant and interesting piece to work on. So far we've gotten away with the platforms that GitHub supports, but it's really cool to see that this is starting to change and people approach us and are willing to provide hardware, provide sponsorship, and help us test on their platforms. But essentially we don't have a good answer for this yet. We're still trying to figure out what this means, what we need to take into consideration, and what our requirements are to use external hardware.

      Xander Cesari : Yeah, everyone is so excited that Rust will run everywhere, but there's a maintenance cost there that is almost exponential in scope.

      Jan David Nose : It's really interesting as well because there's a tension there. I think with IBM, for example, approaching us, it's an interesting example. Who has IBM platforms at home? The number of users for that platform is really small globally, but IBM also invests heavily in Rust, tries to make this happen, and is willing to provide the hardware.

      Jan David Nose : For us, that leads to a set of questions. Is there a line? Is there a certain requirement? Is there a certain amount of usage that a platform would need for us to promote it? Or do we say we want to promote as much as we can to Tier 1? This is a conversation we haven't really had to have yet. It's only now starting to creep in as Rust is adopted more widely and companies pour serious money and resources into it. That's exciting to see.

      Jan David Nose : In this specific case, companies approach the Infra Team to figure out how we can add their platforms to CI as a first step towards Tier 1 support. But it's also a broader discussion we need to have with larger parts of the Rust Project. For Tier 1 promotions, for example, the Compiler Team needs to sign off, Infra needs to sign off. Many more people need to be involved in this discussion of how we can support the growing needs of the ecosystem at large.

      Xander Cesari : I get the feeling that's going to be a theme throughout this interview.

      Jan David Nose : 100%.

      Xander Cesari : So one other tool that's part of this pipeline that I totally didn't know about for a long time, and I think a talk at a different conference clued me into it, is Crater. It's a tool that attempts to run all of the Rust code it can find on the internet. Can you talk about what that tool does and how it integrates into the release process?

      Jan David Nose : Whenever someone creates a pull request on GitHub to add a new feature or bug fix to the Rust compiler, they can start what's called a Crater run, or an experiment. Crater is effectively a large fleet of machines that tries to pull in as many crates as it can. Ideally, we would love to test all crates, but for a variety of reasons that's not possible. Some crates simply don't build reliably, so we maintain lists to exclude those. From the top of my head, I think we currently test against roughly 60% of crates.

      Jan David Nose : The experiment takes the code from your pull request, builds the Rust compiler with it, and then uses that compiler to build all of these crates. It reports back whether there are any regressions related to the change you proposed. That is a very important tool for us to maintain backwards compatibility with new versions and new features in Rust. It lets us ask: does the ecosystem still compile if we add this feature to the compiler, and where do we run into issues? Then, and this is more on the Compiler Team side, there's a decision about how to proceed. Is the breakage acceptable? Do we need to adjust the feature? Having Crater is what makes that conversation possible because it gives us real data on the impact on the wider ecosystem.

      Xander Cesari : I think that's so interesting because as more and more companies adopt Rust, they're asking whether the language is going to be stable and backward compatible. You hear about other programming languages that had a big version change that caused a lot of drama and code changes. The fact that if you have code on crates.io, the Compiler Team is probably already testing against it for backwards compatibility is pretty reassuring.

      Jan David Nose : Yeah, the chances are high, I would say. Especially looking at the whole Python 2 to Python 3 migration, I think as an industry we've learned a lot from those big version jumps. I can't really speak for the Compiler Team because I'm not a member and I wasn't involved in the decision-making, but I feel this is one of the reasons why backwards compatibility is such a big deal in Rust's design. We want to make it as painless as possible to stay current, stay up to date, and make sure we don't accidentally break the language or create painful migration points where the entire ecosystem has to move at once.

      Xander Cesari : Do you know if there are other organizations pulling in something like Crater and running it on their own internal crate repositories, maybe some of the big tech companies or other compiler developers or even other languages? Or is this really bespoke for the Rust compiler team?

      Jan David Nose : I don't know of anyone who runs Crater itself as a tool. Crater is built on a sandboxing framework that we also use in other places. For example, docs.rs uses some of the same underlying infrastructure to build all of the documentation. We try to share as much as we can of the functionality that exists in Crater, but I'm not aware of anyone using Crater in the same way we do.

      Xander Cesari : Gotcha. The other big part of your job is that the Infra Team works on supporting maintainers, but it also supports users and consumers of Rust who are pulling from crates.io. It sounds like crates.io is not directly within your team, but you support a lot of the backend there.

      Jan David Nose : Yeah, exactly. crates.io has its own team, and that team maintains the web application and the APIs. The crates themselves, all the individual files that people download, are hosted within our infrastructure. The Infra Team maintains the content delivery network that sits in front of that. Every download of a crate goes through infrastructure that we maintain. We collaborate very closely with the crates.io team on this shared interface. They own the app and the API, and we make sure that the files get delivered to the end user.

      Xander Cesari : So it sounds like there's a lot of verification of the files that get uploaded and checks every time someone pushes a new version to crates.io. That part all happens within crates.io as an application.

      Jan David Nose : Cargo uses the crates.io API to upload the crate file. crates.io has a lot of internal logic to verify that it is valid and that everything looks correct. For us, as the Infra Team, we treat that as a black box. crates.io does its work, and if it is happy with the upload, it stores the file in S3. From that point onward, infrastructure makes sure that the file is accessible and can be downloaded so people can start using your crate.

      Xander Cesari : In this theme of Rust being a bit of a victim of its own success, I assume all of the traffic graphs and download graphs are very much up and to the right.

      Jan David Nose : On the Foundation side, one of our colleagues likes to check how long it takes for one billion downloads to happen on crates.io, and that number has been falling quickly. I don't remember what it was three years ago, but it has come down by orders of magnitude. In our download traffic we definitely see exponential growth. Our traffic tends to double year over year, and that trend has been pretty stable. It really seems like Rust is getting a lot of adoption in the ecosystem and people are using it for more and more things.

      Xander Cesari : How has the Infra Team scaled with that? Are you staying ahead of it, or are there a lot of late nights?

      Jan David Nose : There have definitely been late nights. In the three years I've been working in the Infra Team, every year has had a different theme that was essentially a fire to put out.

      Jan David Nose : It changes because we fix one thing and then the next thing breaks. So far, luckily, those fires have been mostly sequential, not parallel. When I joined, bandwidth was the big topic. Over the last year, it has been more about CI. About three years ago, we hit this inflection point where traffic was doubling and the sponsorship capacity we had at the time was reaching its limits.

      Jan David Nose : Two or three years ago, Fastly welcomed us into their Fast Forward program and has been sponsoring all of our bandwidth since then. That has mostly helped me sleep at night. It has been a very good relationship. They have been an amazing partner and have helped us at every step to remove the fear that we might hit limits. They are very active in the open source community at large; most famously they also sponsor PyPI and the Python ecosystem, compared to which we're a tiny fish in a very big pond. That gives us a lot of confidence that we can sustain this growth and keep providing crates and releases at the level of quality people expect.

      Xander Cesari : In some ways, Rust did such a good job of making all of that infrastructure feel invisible. You just type Cargo commands into your terminal and it feels magical.

      Jan David Nose : I'm really happy about that. It's an interesting aspect of running an infrastructure team in open source. If you look at the ten-year history since the first stable release, or even the fifteen years since Rust really started, infrastructure was volunteer-run for most of that time. I've been here for three years, and I was the first full-time infrastructure engineer. So for ten to twelve years, volunteers ran the infrastructure.

      Jan David Nose : For them, it was crucial that things just worked, because you can't page volunteers in the middle of the night because a server caught fire or downloads stopped working. From the beginning, our infrastructure has been designed to be as simple and as reliable as possible. The same is true for our CDNs. I always feel a bit bad because Fastly is an amazing sponsor. Every time we meet them at conferences or they announce new features, they ask whether we want to use them or talk about how we use Fastly in production. And every time I have to say: we have the simplest configuration possible. We set some HTTP headers. That's pretty much it.

      Jan David Nose : It's a very cool platform, but we use the smallest set of features because we need to maintain all of this with a very small team that is mostly volunteer-based. Our priority has always been to keep things simple and reliable and not chase every fancy new technology, so that the project stays sustainable.

      Xander Cesari : Volunteer-based organizations seem to have to care about work-life balance, which is probably terrific, and there are lessons to be learned there.

      Jan David Nose : Yeah, it's definitely a very interesting environment to work in. It has different rules than corporations or commercial teams. We have to think about how much work we can do in a given timeframe in a very different way, because it's unpredictable when volunteers have time, when they're around, and what is happening in their lives.

      Jan David Nose : Over the last few years, we've tried to reduce the number of fires that can break out. And when they do happen, we try to shield volunteers from them and take that work on as full-time employees. That started with me three years ago. Last year Marco joined, which increased the capacity we have, because there is so much to do on the Infra side that even with me working full-time, we simply did not have enough people.

      Xander Cesari : So you're two full-time and everything else is volunteer.

      Jan David Nose : Exactly. The team is around eight people. Marco and I work full-time and are paid by the Rust Foundation to focus exclusively on infrastructure. Then we have a handful of volunteers who work on different things.

      Jan David Nose : Because our field of responsibility is so wide, the Infra Team works more in silos than other teams might. We have people who care deeply about very specific parts of the infrastructure. Otherwise there is simply too much to know for any one person. It has been a really nice mix, and it's amazing to work with the people on the team.

      Jan David Nose : As someone who is privileged enough to work full-time on this and has the time and resources, we try to bear the bigger burden and create a space that is fun for volunteers to join. We want them to work on exciting things where there is less risk of something catching fire, where it's easier to come in, do a piece of work, and then step away. If your personal life takes over for two weeks, that's okay, because someone is there to make sure the servers and the lights stay on.

      Jan David Nose : A lot of that work lives more on the maintainer side: the GitHub apps, the bots that help with triage. It's less risky if something goes wrong there. On the user side, if you push the wrong DNS setting, as someone might have done, you can end up in a situation where for 30 minutes no one can download crates. And in this case, "no one" literally means no user worldwide. That's not an experience I want volunteers to have. It's extremely stressful and was ultimately one of the reasons I joined in the first place—there was a real feeling of burnout from carrying that responsibility.

      Jan David Nose : It's easier to carry that as a full-timer. We have more time and more ways to manage the stress. I'm honestly extremely amazed by what the Infra Team was able to do as volunteers. It's unbelievable what they built and how far they pushed Rust to get to where we are now.

      Xander Cesari : I think anyone who's managing web traffic in 2025 is talking about traffic skyrocketing due to bots and scrapers for AI or other purposes. Has that hit the Rust network as well?

      Jan David Nose : Yeah, we've definitely seen that. It's handled by a slightly different team, but on the docs.rs side in particular we've seen crawlers hit us hard from time to time, and that has caused noticeable service degradation. We're painfully aware of the increase in traffic that comes in short but very intense bursts when crawlers go wild.

      Jan David Nose : That introduces a new challenge for our infrastructure. We need to figure out how to react to that traffic and protect our services from becoming unavailable to real users who want to use docs.rs to look up something for their work. On the CDN side, our providers can usually handle the traffic. It is more often the application side where things hurt.

      Jan David Nose : On the CDN side we also see people crawling crates.io, presumably to vacuum up the entire crates ecosystem into an LLM. Fortunately, over the last two years we've done a lot of work to make sure crates.io as an application is less affected by these traffic spikes. Downloads now bypass crates.io entirely and go straight to the CDN, so the API is not hit by these bursts. In the past, this would have looked like a DDoS attack, with so many requests from so many sources that we couldn't handle it.

      Jan David Nose : We've done a lot of backend work to keep our stack reliable, but it's definitely something that has changed the game over the last year. We can clearly see that crawlers are much more active than before.

      Xander Cesari : That makes sense. I'm sure Fastly is working on this as well. Their business has to adapt to be robust to this new internet.

      Jan David Nose : Exactly. For example, one of the conversations we're having right now is about docs.rs. It's still hosted on AWS behind CloudFront, but we're talking about putting it behind Fastly because through Fastly we get features like bot protection that can help keep crawlers out.

      Jan David Nose : This is a good example of how our conversations have changed in the last six months. At the start of the year I did not think this would be a topic we would be discussing. We were focused on other things. For docs.rs we have long-term plans to rebuild the infrastructure that powers it, and I expected us to spend our energy there. But with the changes in the industry and everyone trying to accumulate as much data as possible, our priorities have shifted. The problems we face and the order in which we tackle them have changed.

      Xander Cesari : And I assume as one of the few paid members of a mostly volunteer team, you often end up working on the fires, not the interesting next feature that might be more fun.

      Jan David Nose : That is true, although it sounds a bit negative to say I only get to work on fires. Sometimes it feels like that because, as with any technology stack, there is a lot of maintenance overhead. We definitely pay that price on the infrastructure side.

      Jan David Nose : Marco, for example, spent time this year going through all the servers we run, cataloging them, and making sure they're patched and on the latest operating system version. We updated our Ubuntu machines to the latest LTS. It feels a bit like busy work—you just have to do it because it's important and necessary, but it's not the most exciting project.

      Jan David Nose : On the other hand, when it comes to things like CDN configuration and figuring out how bot protection features work and whether they are relevant to us, that is also genuinely interesting work. It lets us play with new tools vendors provide, and we're working on challenges that the wider industry is facing. How do you deal with this new kind of traffic? What are the implications of banning bots? How high is the risk of blocking real users? Sometimes someone just misconfigures a curl script, and from the outside it looks like they're crawling our site.

      Jan David Nose : So it's an interesting field to work in, figuring out how we can use new features and address new challenges. That keeps it exciting even for us full-timers who do more of the "boring" work. We get to adapt alongside how the world around us is changing. If there's one constant, it's change.

      Xander Cesari : Another ripped-from-the-headlines change around this topic is software supply chain security, and specifically xz-utils and the conversation around open source security. How much has that changed the landscape you work in?

      Jan David Nose : The xz-utils compromise was scary. I don't want to call it a wake-up call, because we've been aware that supply chain security is a big issue and this was not the first compromise. But the way it happened felt very unsettling. You saw an actor spend a year and a half building social trust in an open source project and then using that to introduce a backdoor.

      Jan David Nose : Thinking about that in the context of Rust: every team in the project talks about how we need more maintainers, how there's too much workload on the people who are currently contributing, and how Rust's growth puts strain on the organization as a whole. We want to be an open and welcoming project, and right now we also need to bring new people in. If someone shows up and says, "I'm willing to help, please onboard me," and they stick around for a year and then do something malicious, we would be susceptible to that. I don't think this is unique to Rust. This is an inherent problem in open source.

      Xander Cesari : Yeah, it's antithetical to the culture.

      Jan David Nose : Exactly. So we're trying to think through how we, as a project and as an ecosystem, deal with persistent threat actors who have the time and resources to play a long game. Paying someone to work full-time on open source for a year is a very different threat model than what we used to worry about.

      Jan David Nose : I used to joke that the biggest threat to crates.io was me accidentally pulling the plug on a CDN. I think that has changed. Today the bigger threat is someone managing to insert malicious code into our releases, our supply chain, or crates.io itself. They could find ways to interfere with our systems in ways we're simply not prepared for, where, as a largely volunteer organization, we might be too slow to react to a new kind of attack.

      Jan David Nose : Looking back over the last three years, this shift became very noticeable, especially after the first year. Traffic was doubling, Rust usage was going up a lot, and there were news stories about Rust being used in the Windows kernel, in Android, and in parts of iOS. Suddenly Rust is everywhere. If you want to attack "everywhere," going after Rust becomes attractive. That definitely puts a target on our back and has changed the game.

      Jan David Nose : I'm very glad the Rust Foundation has a dedicated security engineer who has done a lot of threat modeling and worked with us on infrastructure security. There's also a lot of work happening specifically around the crates ecosystem and preventing supply chain attacks through crates. Luckily, it's not something the Infra side has to solve alone. But it is getting a lot more attention, and I think it will be one of the big challenges for the future: how a mostly volunteer-run project keeps up with this looming threat.

      Xander Cesari : And it is the industry at large. This is not a unique problem to the Rust package manager. All package registries, from Python to JavaScript to Nix, deal with this. Is there an industry-wide conversation about how to help each other out and share learnings?

      Jan David Nose : Yeah, there's definitely a lot happening. I have to smile a bit because, with a lot of empathy but also a bit of relief, we sometimes share news when another package ecosystem gets compromised. It is a reminder that it's not just us; sometimes it's npm's turn.

      Jan David Nose : We really try to stay aware of what's happening in the industry and in other ecosystems: what new threats or attack vectors are emerging, what others are struggling with. Sometimes that is security; sometimes it's usability. A year and a half ago, for example, npm had the "everything" package where someone declared every package on npm as a dependency, which blew up the index. We look at incidents like that and ask whether crates.io would struggle with something similar and whether we need to make changes.

      Jan David Nose : On the security side we also follow closely what others are doing. In the packaging community, the different package managers are starting to come together more often to figure out which problems everyone shares. There is a bit of a joke that we're all just shipping files over the internet. Whether it's an npm package or a crate, ultimately it's a bunch of text files in a zip. So from an infrastructure perspective the problems are very similar.

      Jan David Nose : These communities are now talking more about what problems PyPI has, what problems crates.io has, what is happening in the npm space. One thing every ecosystem has seen—even the very established ones—is a big increase in bandwidth needs, largely connected to the emergence of AI. PyPI, for example, publishes download charts, and it's striking. Python had steady growth—slightly exponential, but manageable—for many years. Then a year or two ago you see a massive hockey stick. People discovered that PyPI was a great distribution system for their models. There were no file size limits at the time, so you could publish precompiled GPU models there.

      Jan David Nose : That pattern shows up everywhere. It has kicked off a new era for packaging ecosystems to come together and ask: in a time where open source is underfunded and traffic needs keep growing, how can we act together to find solutions to these shared problems? crates.io is part of those conversations. It's interesting to see how we, as an industry, share very similar problems across ecosystems—Python, npm, Rust, and others.

      Xander Cesari : With a smaller, more hobbyist-focused community, you can have relaxed rules about what goes into your package manager. Everyone knows the spirit of what you're trying to do and you can get away without a lot of hard rules and consequences. Is the Rust world going to have to think about much harder rules around package sizes, allowed files, and how you're allowed to distribute things?

      Jan David Nose : Funnily enough, we're coming at this from the opposite direction. Compared to other ecosystems, we've always had fairly strict limits. A crate can be at most around ten megabytes in size. There are limits on what kinds of files you can put in there. Ironically, those limits have helped us keep traffic manageable in this period.

      Jan David Nose : At the same time, there is a valid argument that these limits may not serve all Rust use cases. There are situations where you might want to include something precompiled in your crate because it is hard to compile locally, takes a very long time, or depends on obscure headers no one has. I don't think we've reached the final state of what the crates.io package format should look like.

      Jan David Nose : That has interesting security implications. When we talk about precompiled binaries or payloads, we all have that little voice in our head every time we see a curl | sh command: can I trust this? The same is true if you download a crate that contains a precompiled blob you cannot easily inspect.

      Jan David Nose : The Rust Foundation is doing a lot of work and research here. My colleague Adam, who works on the crates.io team, is working behind the scenes to answer some of these questions. For example: what kind of security testing can we do before we publish crates to make sure they are secure and don't contain malicious payloads? How do we surface this information? How do we tell a publisher that they included files that are not allowed? And from the user's perspective, when you visit crates.io, how can you judge how well maintained and how secure a crate is?

      Jan David Nose : Those conversations are happening quite broadly in the ecosystem. On the Infra side we're far down the chain. Ultimately we integrate with whatever security scanning infrastructure crates.io builds. We don't have to do the security research ourselves, but we do have to support it.

      Jan David Nose : There's still a lot that needs to happen. As awesome as Rust already is, and as much as I love using it, it's important to remember that we're still a very young ecosystem. Python is now very mature and stable, but it's more than 25 years old. Rust is about ten years old as a stable language. We still have a lot to learn and figure out.

      Xander Cesari : Is the Rust ecosystem running into problems earlier than other languages because we're succeeding at being foundational software and Rust is used in places that are even more security-critical than other languages, so you have to hit these hard problems earlier than the Python world did?

      Jan David Nose : I think that's true. Other ecosystems probably had more time to mature and answer these questions. We're operating on a more condensed timeline. There is also simply more happening now. Open source has been very successful; it's everywhere. That means there are more places where security is critical.

      Jan David Nose : So this comes with the success of open source, with what is happening in the ecosystem at large, and with the industry we're in. It does mean we have less time to figure some things out. On the flip side, we also have less baggage. We have less technical debt and fifteen fewer years of accumulated history. That lets us be on the forefront in some areas, like how a package ecosystem can stay secure and what infrastructure a 21st century open source project needs.

      Jan David Nose : Here I really want to call out the Rust Foundation. They actively support this work: hiring people like Marco and me to work full-time on infrastructure, having Walter and Adam focus heavily on security, and as an organization taking supply chain considerations very seriously. The Foundation also works with other ecosystems so we can learn and grow together and build a better industry.

      Jan David Nose : Behind the scenes, colleagues constantly work to open doors for us as a relatively young language, so we can be part of those conversations and sit at the table with other ecosystems. That lets us learn from what others have already gone through and also help shape where things are going. Sustainability is a big part of that: how do we fund the project long term? How do we make sure we have the human resources and financial resources to run the infrastructure and support maintainers? I definitely underestimated how much of my job would be relationship management and budget planning, making sure credits last until new ones arrive.

      Xander Cesari : Most open core business models give away the thing that doesn't cost much—the software—and charge for the thing that scales with use—the service. In Rust's case, it's all free, which is excellent for adoption, but it must require a very creative perspective on the business side.

      Jan David Nose : Yeah, and that's where different forces pull in opposite directions. As an open source project, we want everyone to be able to use Rust for free. We want great user experience. When we talk about downloads, there are ways for us to make them much cheaper, but that might mean hosting everything in a single geographic location. Then everyone, including people in Australia, would have to download from, say, Europe, and their experience would get much worse.

      Jan David Nose : Instead, we want to use services that are more expensive but provide a better experience for Rust users. There's a real tension there. On one side we want to do the best we can; on the other side we need to be realistic that this costs money.

      Xander Cesari : I had been thinking of infrastructure as a binary: it either works or it doesn't. But you're right, it's a slider. You can pick how much money you want to spend and what quality of service you get. Are there new technologies coming, either for the Rust Infra Team or the packaging world in general, to help with these security problems? New sandboxing technologies or higher-level support?

      Jan David Nose : A lot of people are working on this problem from different angles. Internally we've talked a lot about it, especially in the context of Crater. Crater pulls in all of those crates to build them and get feedback from the Rust compiler. That means if someone publishes malicious code, we will download it and build it.

      Jan David Nose : In Rust this is a particular challenge because build scripts can essentially do anything on your machine. For us that means we need strong sandboxing. We've built our own sandboxing framework so every crate build runs in an isolated container, which prevents malicious code from escaping and messing with the host systems.

      Jan David Nose : We feel that pain in Crater, but if we can solve it in a way that isn't exclusive to Crater—if it also protects user machines from the same vulnerabilities—that would be ideal. People like Walter on the Foundation side are actively working on that. I'm sure there are conversations in the Cargo and crates teams as well, because every team that deals with packages sees a different angle of the problem. We all have to come together to solve it, and there is a lot of interesting work happening in that area.

      Xander Cesari : I hope help is coming.

      Jan David Nose : I'm optimistic.

      Xander Cesari : We have this exponential curve with traffic and everything else. It seems like at some point it has to taper off.

      Jan David Nose : We'll see. Rust is a young language. I don't know when that growth will slow down. I think there's a good argument that it will continue for quite a while as adoption grows.

      Jan David Nose : Being at a conference like RustConf, it's exciting to see how the mix of companies has changed over time. We had a talk from Rivian on how they use Rust in their cars. We've heard from other car manufacturers exploring it. Rust is getting into more and more applications that a few years ago would have been hard to imagine or where the language simply wasn't mature enough yet.

      Jan David Nose : As that continues, I think we'll see new waves of growth that sustain the exponential curve we currently have, because we're moving into domains that are new for us. It's amazing to see who is talking about Rust and how they're using it, sometimes in areas like space that you wouldn't expect.

      Jan David Nose : I'm very optimistic about Rust's future. With this increase in adoption, we'll see a lot of interesting lessons about how to use Rust and a lot of creative ideas from people building with it. With more corporate adoption, I also expect a new wave of investment into the ecosystem: companies paying people to work full-time on different parts of Rust, both in the ecosystem and in the core project. I'm very curious what the next ten years will look like, because I genuinely don't know.

      Xander Cesari : The state of Rust right now does feel a bit like the dog that caught the car and now doesn't know what to do with it.

      Jan David Nose : Yeah, I think that's a good analogy. Suddenly we're in a situation where we realize we haven't fully thought through every consequence of success. It's fascinating to see how the challenges change every year. We keep running into new growing pains where something that wasn't an issue a year ago suddenly becomes one because growth keeps going up.

      Jan David Nose : We're constantly rebuilding parts of our infrastructure to keep up with that growth, and I don't see that stopping soon. As a user, that makes me very excited. With the language and the ecosystem growing at this pace, there are going to be very interesting things coming that I can't predict today.

      Jan David Nose : For the project, it also means there are real challenges: financing the infrastructure we need, finding maintainers and contributors, and creating a healthy environment where people can work without burning out. There is a lot of work to be done, but it's an exciting place to be.

      Xander Cesari : Well, thank you for all your work keeping those magic Cargo commands I can type into my terminal just working in the background. If there's any call to action from this interview, it's that if you're a company using Rust, maybe think about donating to keep the Infra Team working.

      Jan David Nose : We always love new Rust Foundation members. Especially if you're a company, that's one of the best ways to support the work we do. Membership gives us a budget we can use either to fund people who work full-time on the project or to fill gaps in our infrastructure sponsorship where we don't get services for free and have to pay real money.

      Jan David Nose : And if you're not a company, we're always looking for people to help out. The Infra Team has a lot of Rust-based bots and other areas where people can contribute relatively easily.

      Xander Cesari : Small scoped bots that you can wrap your head around and help out with.

      Jan David Nose : Exactly. It is a bit harder on the Infra side because we can't give people access to our cloud infrastructure. There are areas where it's simply not possible to contribute as a volunteer because you can't have access to the production systems. But there is still plenty of other work that can be done.

      Jan David Nose : Like every other team in the project, we're a bit short-staffed. So when you're at conferences, come talk to me or Marco. We have work to do.

      Xander Cesari : Well, thank you for doing the work that keeps Rust running.

      Jan David Nose : I'm happy to.

      Xander Cesari : Awesome. Thank you so much.

  2. November 24, 2025
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2025-11-24 rss

      IDA Plugin Updates on 2025-11-24

      New Releases:

      Activity:

    2. 🔗 r/LocalLLaMA That's why local models are better rss

      That's why local models are better | That is why local models are better than the private ones. On top of that, this model is still expensive; I will be surprised when the US models reach an optimized price like those in China. The price reflects the optimization of the model, did you know? submitted by /u/Illustrious-Swim9663
      [link] [comments]

    3. 🔗 r/wiesbaden Difficulties finding an apartment in Wiesbaden – need your advice rss

      Hey everyone,

      I've been looking for an apartment in Wiesbaden for months (the edge of the city is fine too) and I'm slowly starting to despair a little. Maybe you have tips or know of alternative places to look that I haven't thought of yet.

      What I'm looking for:

      Max. €850 warm rent (i.e. including heating and utilities)

      At least 2 rooms (living room + bedroom + kitchen)

      A parking space would be ideal, but it's not a must

      Location: Rambach would be great, but the outer districts in general are completely fine too. By now I would even move into the city center, even without a parking space, if everything else fits.

      The problem:

      I use ImmoScout (with a membership), but anything good seems to be gone immediately.

      I've been checking daily for months, but I'm just not getting anywhere.

      I have the feeling that I'm overlooking something or simply not fast enough.

      My questions for you:

      Do you know of other platforms or websites specifically for Wiesbaden that aren't quite so overrun?

      Which districts would you recommend in my price range, or rather avoid?

      Do you have tips on how to react faster or improve my chances?

      Are there housing associations in Wiesbaden that I should contact directly?

      Or is my budget + desired combination (2 rooms + parking space) simply unrealistic in Wiesbaden?

      Thanks in advance for any hint or first-hand experience. Any help is welcome! 🙏

      submitted by /u/Timbo_Slicce
      [link] [comments]

    4. 🔗 r/LocalLLaMA Coursera Founder And AI Pioneer Andrew Ng Just Dropped An AI Reviewer That Performs At Human Level rss

      Coursera Founder And AI Pioneer Andrew Ng Just Dropped An AI Reviewer That Performs At Human Level | Andrew Ng just announced a new Agentic Reviewer that gives research feedback approaching human-level performance. It was trained on ICLR 2025 reviews and scored:

      • 0.41 correlation between two human reviewers
      • 0.42 correlation between the AI and a human reviewer

      Meaning: the AI reviewer is now effectively as reliable as a human reviewer, and it can potentially replace the 6-month feedback loop researchers normally suffer through when submitting papers. It searches arXiv for context, analyzes your paper, and returns structured review comments instantly. For anyone who’s had a paper rejected multiple times and waited months each round… this could be game-changing. Try the tool here: 👉 https://paperreview.ai submitted by /u/AskGpts
      [link] [comments]

    5. 🔗 Simon Willison Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult rss

      Anthropic released Claude Opus 4.5 this morning, which they call "best model in the world for coding, agents, and computer use". This is their attempt to retake the crown for best coding model after significant challenges from OpenAI's GPT-5.1-Codex-Max and Google's Gemini 3, both released within the past week!

      The core characteristics of Opus 4.5 are a 200,000 token context (same as Sonnet), 64,000 token output limit (also the same as Sonnet), and a March 2025 "reliable knowledge cutoff" (Sonnet 4.5 is January, Haiku 4.5 is February).

      The pricing is a big relief: $5/million for input and $25/million for output. This is a lot cheaper than the previous Opus at $15/$75 and keeps it a little more competitive with the GPT-5.1 family ($1.25/$10) and Gemini 3 Pro ($2/$12, or $4/$18 for >200,000 tokens). For comparison, Sonnet 4.5 is $3/$15 and Haiku 4.5 is $1/$5.
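
      To put those per-million-token prices in concrete terms, here is a minimal Python sketch that works out the cost of a single call at the old Opus 4.1 rates versus the new Opus 4.5 rates quoted above (the 50,000-input / 5,000-output token counts are made-up example numbers, not anything from the release):

      ```python
      # Prices in USD per million tokens, as quoted above.
      PRICES = {
          "Opus 4.1": {"input": 15.00, "output": 75.00},
          "Opus 4.5": {"input": 5.00, "output": 25.00},
      }

      def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
          """Cost in USD of one call: tokens / 1e6 * price per million tokens."""
          p = PRICES[model]
          return input_tokens / 1e6 * p["input"] + output_tokens / 1e6 * p["output"]

      # Hypothetical call: 50k input tokens, 5k output tokens.
      for model in PRICES:
          print(f"{model}: ${call_cost(model, 50_000, 5_000):.3f}")
      # Opus 4.1: $1.125 vs Opus 4.5: $0.375 -- roughly a 3x reduction at these example sizes.
      ```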

      The Key improvements in Opus 4.5 over Opus 4.1 document has a few more interesting details:

      I had access to a preview of Anthropic's new model over the weekend. I spent a bunch of time with it in Claude Code, resulting in a new alpha release of sqlite-utils that included several large-scale refactorings - Opus 4.5 was responsible for most of the work across 20 commits, 39 files changed, 2,022 additions and 1,173 deletions in a two day period. Here's the Claude Code transcript where I had it help implement one of the more complicated new features.

      It's clearly an excellent new model, but I did run into a catch. My preview expired at 8pm on Sunday when I still had a few remaining issues in the milestone for the alpha. I switched back to Claude Sonnet 4.5 and... kept on working at the same pace I'd been achieving with the new model.

      With hindsight, production coding like this is a less effective way of evaluating the strengths of a new model than I had expected.

      I'm not saying the new model isn't an improvement on Sonnet 4.5 - but I can't say with confidence that the challenges I posed it were able to identify a meaningful difference in capabilities between the two.

      This represents a growing problem for me. My favorite moments in AI are when a new model gives me the ability to do something that simply wasn't possible before. In the past these have felt a lot more obvious, but today it's often very difficult to find concrete examples that differentiate the new generation of models from their predecessors.

      Google's Nano Banana Pro image generation model was notable in that its ability to render usable infographics really does represent a task at which previous models had been laughably incapable.

      The frontier LLMs are a lot harder to differentiate between. Benchmarks like SWE-bench Verified show models beating each other by single digit percentage point margins, but what does that actually equate to in real-world problems that I need to solve on a daily basis?

      And honestly, this is mainly on me. I've fallen behind on maintaining my own collection of tasks that are just beyond the capabilities of the frontier models. I used to have a whole bunch of these but they've fallen one-by-one and now I'm embarrassingly lacking in suitable challenges to help evaluate new models.

      I frequently advise people to stash away tasks that models fail at in their notes so they can try them against newer models later on - a tip I picked up from Ethan Mollick. I need to double-down on that advice myself!

      I'd love to see AI labs like Anthropic help address this challenge directly. I'd like to see new model releases accompanied by concrete examples of tasks they can solve that the previous generation of models from the same provider were unable to handle.

      "Here's an example prompt which failed on Sonnet 4.5 but succeeds on Opus 4.5" would excite me a lot more than some single digit percent improvement on a benchmark with a name like MMLU or GPQA Diamond.

      In the meantime, I'm just gonna have to keep on getting them to draw pelicans riding bicycles. Here's Opus 4.5 (on its default "high" effort level):

      The pelican is cute and looks pretty good. The bicycle is not great - the frame is wrong and the pelican is facing backwards when the handlebars appear to be forwards. There is also something that looks a bit like an egg on the handlebars.

      It did significantly better on the new more detailed prompt:

      The pelican has feathers and a red pouch - a close enough version of breeding plumage. The bicycle is a much better shape.

      Here's that same complex prompt against Gemini 3 Pro and against GPT-5.1-Codex-Max-xhigh.

      Still susceptible to prompt injection

      From the safety section of Anthropic's announcement post:

      With Opus 4.5, we’ve made substantial progress in robustness against prompt injection attacks, which smuggle in deceptive instructions to fool the model into harmful behavior. Opus 4.5 is harder to trick with prompt injection than any other frontier model in the industry:

      Bar chart titled "Susceptibility to prompt-injection style attacks" with subtitle "At k queries; lower is better". Y-axis shows "ATTACK SUCCESS RATE (%)" from 0-100. Five stacked bars compare AI models with three k values (k=1 in dark gray, k=10 in beige, k=100 in pink). Results: Gemini 3 Pro Thinking (12.5, 60.7, 92.0), GPT-5.1 Thinking (12.6, 58.2, 87.8), Haiku 4.5 Thinking (8.3, 51.1, 85.6), Sonnet 4.5 Thinking (7.3, 41.9, 72.4), Opus 4.5 Thinking (4.7, 33.6, 63.0).

      On the one hand this looks great: it's a clear improvement over previous models and the competition.

      What does the chart actually tell us, though? It tells us that a single prompt injection attempt still succeeds about 1 in 20 times, and if an attacker can try ten different attacks that success rate climbs to roughly 1 in 3!

      I still don't think training models not to fall for prompt injection is the way forward here. We continue to need to design our applications under the assumption that a suitably motivated attacker will be able to find a way to trick the models.

      You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

    6. 🔗 HexRaysSA/plugin-repository commits sync repo: +2 plugins, +2 releases rss
      sync repo: +2 plugins, +2 releases
      
      ## New plugins
      - [parascope](https://github.com/xorpse/parascope) (0.3.0)
      - [unicorn-tracer-arm64](https://github.com/chenxvb/Unicorn-Trace) (0.1)
      
    7. 🔗 3Blue1Brown (YouTube) The most absurd product I've made rss

      Because why not make a pi creature neck pillow? Available at 3b1b.co/store

    8. 🔗 sacha chua :: living an awesome life 2025-11-24 Emacs news rss

      Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, Mastodon #emacs, Bluesky #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!

      You can e-mail me at sacha@sachachua.com.

    9. 🔗 @HexRaysSA@infosec.exchange ⬆️ Grow with us! [#hiring](https://infosec.exchange/tags/hiring) mastodon

      ⬆️ Grow with us! #hiring

      Hex-Rays has a handful of open positions in our product, engineering and business development departments. We have offices in Belgium and Romania, with onsite and hybrid opps.

      Start the new year with a new adventure...

      ● Engineering - Back End Developer: https://lnkd.in/gE5uW4Gn

      ● Engineering - Senior Software Engineer: https://lnkd.in/gP93WrVQ

      ● Product - Product Owner, IDA Pro: https://lnkd.in/g4AJqrFq

      ● Business Dev - Growth Program Manager: https://lnkd.in/gv9Hscjy

    10. 🔗 @binaryninja@infosec.exchange Binary Ninja 5.2 adds support for custom string formats and constant mastodon

      Binary Ninja 5.2 adds support for custom string formats and constant encodings. Instead of wrestling with odd or obfuscated values, you can teach Binja how they work and let the analysis reveal the real content anywhere it appears. This update lays the groundwork for more language-aware features coming soon. https://binary.ninja/2025/11/13/binary-ninja-5.2-io.html#custom-strings--constants

    11. 🔗 Simon Willison sqlite-utils 4.0a1 has several (minor) backwards incompatible changes rss

      I released a new alpha version of sqlite-utils last night - the 128th release of that package since I started building it back in 2018.

      sqlite-utils is two things in one package: a Python library for conveniently creating and manipulating SQLite databases and a CLI tool for working with them in the terminal. Almost every feature provided by the package is available via both of those surfaces.

      This is hopefully the last alpha before a 4.0 stable release. I use semantic versioning for this library, so the 4.0 version number indicates that there are backward incompatible changes that may affect code written against the 3.x line.

      These changes are mostly very minor: I don't want to break any existing code if I can avoid it. I made it all the way to version 3.38 before I had to ship a major release and I'm sad I couldn't push that even further!

      Here are the annotated release notes for 4.0a1.

      • Breaking change: The db.table(table_name) method now only works with tables. To access a SQL view use db.view(view_name) instead. (#657)

      This change is for type hint enthusiasts. The Python library used to encourage accessing both SQL tables and SQL views through the db["name_of_table_or_view"] syntactic sugar - but tables and views have different interfaces since there's no way to handle a .insert(row) on a SQLite view. If you want clean type hints for your code, you can now use the db.table(table_name) and db.view(view_name) methods instead.

      • The table.insert_all() and table.upsert_all() methods can now accept an iterator of lists or tuples as an alternative to dictionaries. The first item should be a list/tuple of column names. See Inserting data from a list or tuple iterator for details. (#672)

      A new feature, not a breaking change. I realized that supporting a stream of lists or tuples as an option for populating large tables would be a neat optimization over always dealing with dictionaries, each of which duplicates the column names.

      I had the idea for this one while walking the dog and built the first prototype by prompting Claude Code for web on my phone. Here's the prompt I used and the prototype report it created, which included a benchmark estimating how much of a performance boost could be had for different sizes of tables.

      • Breaking change: The default floating point column type has been changed from FLOAT to REAL, which is the correct SQLite type for floating point values. This affects auto-detected columns when inserting data. (#645)

      I was horrified to discover a while ago that I'd been creating SQLite columns called FLOAT but the correct type to use was REAL! This change fixes that. Previously the fix was to ask for tables to be created in strict mode.

      • Now uses pyproject.toml in place of setup.py for packaging. (#675)

      As part of this I also figured out recipes for using uv as a development environment for the package, which are now baked into the Justfile.

      • Tables in the Python API now do a much better job of remembering the primary key and other schema details from when they were first created. (#655)

      This one is best explained in the issue.

      • Breaking change: The table.convert() and sqlite-utils convert mechanisms no longer skip values that evaluate to False. Previously the --skip-false option was needed; it has been removed. (#542)

      Another change which I would have made earlier but, since it introduces a minor behavior change to an existing feature, I reserved it for the 4.0 release.

      • Breaking change: Tables created by this library now wrap table and column names in "double-quotes" in the schema. Previously they would use [square-braces]. (#677)

      Back in 2018 when I started this project I was new to working in-depth with SQLite and incorrectly concluded that the correct way to create tables and columns named after reserved words was like this:

      create table [my table] (
        [id] integer primary key,
        [key] text
      )
      

      That turned out to be a non-standard SQL syntax which the SQLite documentation describes like this:

      A keyword enclosed in square brackets is an identifier. This is not standard SQL. This quoting mechanism is used by MS Access and SQL Server and is included in SQLite for compatibility.

      Unfortunately I baked it into the library early on and it's been polluting the world with weirdly escaped table and column names ever since!

      I've finally fixed that, with the help of Claude Code which took on the mind-numbing task of updating hundreds of existing tests that asserted against the generated schemas.

      The above example table schema now looks like this:

      create table "my table" (
        "id" integer primary key,
        "key" text
      )
      

      This may seem like a pretty small change but I expect it to cause a fair amount of downstream pain purely in terms of updating tests that work against tables created by sqlite-utils!

      • The --functions CLI argument now accepts a path to a Python file in addition to accepting a string full of Python code. It can also now be specified multiple times. (#659)

      I made this change first in LLM and decided to bring it to sqlite-utils for consistency between the two tools.

      • Breaking change: Type detection is now the default behavior for the insert and upsert CLI commands when importing CSV or TSV data. Previously all columns were treated as TEXT unless the --detect-types flag was passed. Use the new --no-detect-types flag to restore the old behavior. The SQLITE_UTILS_DETECT_TYPES environment variable has been removed. (#679)

      One last minor ugliness that I waited for a major version bump to fix.

      Update: Now that the embargo has lifted I can reveal that a substantial amount of the work on this release was performed using a preview version of Anthropic's new Claude Opus 4.5 model. Here's the Claude Code transcript for the work to implement the ability to use an iterator over lists instead of dictionaries for bulk insert and upsert operations.

      You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

    12. 🔗 Anton Zhiyanov Gist of Go: Concurrency testing rss

      This is a chapter from my book on Go concurrency, which teaches the topic from the ground up through interactive examples.

      Testing concurrent programs is a lot like testing single-task programs. If the code is well-designed, you can test the state of a concurrent program with standard tools like channels, wait groups, and other abstractions built on top of them.

      But if you've made it this far, you know that concurrency is never that easy. In this chapter, we'll go over common testing problems and the solutions that Go offers.

      Waiting for goroutines • Checking channels • Checking for leaks • Durable blocking • Instant waiting • Time inside the bubble • Thoughts on time 1 ✎ • Thoughts on time 2 ✎ • Checking for cleanup • Bubble rules • Keep it up

      Waiting for goroutines to finish

      Let's say we want to test this function:

      // Calc calculates something asynchronously.
      func Calc() <-chan int {
          out := make(chan int, 1)
          go func() {
              out <- 42
          }()
          return out
      }
      

      Calculations run asynchronously in a separate goroutine. However, the function returns a result channel, so this isn't a problem:

      func Test(t *testing.T) {
          got := <-Calc() // (X)
          if got != 42 {
              t.Errorf("got: %v; want: 42", got)
          }
      }
      
      
      
      PASS
      

      At point ⓧ, the test is guaranteed to wait for the inner goroutine to finish. The rest of the test code doesn't need to know anything about how concurrency works inside the Calc function. Overall, the test isn't any more complicated than if Calc were synchronous.

      But we're lucky that Calc returns a channel. What if it doesn't?

      Naive approach

      Let's say the Calc function looks like this:

      var state atomic.Int32
      
      // Calc calculates something asynchronously.
      func Calc() {
          go func() {
              state.Store(42)
          }()
      }
      

      We write a simple test and run it:

      func TestNaive(t *testing.T) {
          Calc()
          got := state.Load() // (X)
          if got != 42 {
              t.Errorf("got: %v; want: 42", got)
          }
      }
      
      
      
      === RUN   TestNaive
          main_test.go:27: got: 0; want: 42
      --- FAIL: TestNaive (0.00s)
      

      The assertion fails because at point ⓧ, we didn't wait for the inner Calc goroutine to finish. In other words, we didn't synchronize the TestNaive and Calc goroutines. That's why state still has its initial value (0) when we do the check.

      Waiting with time.Sleep

      We can add a short delay with time.Sleep:

      func TestSleep(t *testing.T) {
          Calc()
      
          // Wait for the goroutine to finish (if we're lucky).
          time.Sleep(50 * time.Millisecond)
      
          got := state.Load()
          if got != 42 {
              t.Errorf("got: %v; want: 42", got)
          }
      }
      
      
      
      === RUN   TestSleep
      --- PASS: TestSleep (0.05s)
      

      The test is now passing. But using time.Sleep to sync goroutines isn't a great idea, even in tests. We don't want to set a custom delay for every function we're testing. Also, the function's execution time may be different on the local machine compared to a CI server. If we use a longer delay just to be safe, the tests will end up taking too long to run.

      Sometimes you can't avoid using time.Sleep in tests, but since Go 1.25, the synctest package has made these cases much less common. Let's see how it works.

      Waiting with synctest

      The synctest package has a lot going on under the hood, but its public API is very simple:

      func Test(t *testing.T, f func(*testing.T))
      func Wait()
      

      The synctest.Test function creates an isolated bubble where you can control time to some extent. Any new goroutines started inside this bubble become part of the bubble. So, if we wrap the test code with synctest.Test, everything will run inside the bubble — the test code, the Calc function we're testing, and its goroutine.

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              Calc()
      
              // (X)
      
              got := state.Load()
              if got != 42 {
                  t.Errorf("got: %v; want: 42", got)
              }
          })
      }
      

      At point ⓧ, we want to wait for the Calc goroutine to finish. The synctest.Wait function comes to the rescue! It blocks the calling goroutine until all other goroutines in the bubble are finished. (It's actually a bit more complicated than that, but we'll talk about it later.)

      In our case, there's only one other goroutine (the inner Calc goroutine), so Wait will pause until it finishes, and then the test will move on.

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              Calc()
      
              // Wait for the goroutine to finish.
              synctest.Wait()
      
              got := state.Load()
              if got != 42 {
                  t.Errorf("got: %v; want: 42", got)
              }
          })
      }
      
      
      
      === RUN   TestSync
      --- PASS: TestSync (0.00s)
      

      Now the test passes instantly. That's better!

      ✎ Exercise: Wait until done

      Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it.

      If you are okay with just theory for now, let's continue.

      Checking the channel state

      As we've seen, you can use synctest.Wait to wait for the tested goroutine to finish, and then check the state of the data you are interested in. You can also use it to check the state of channels.

      Let's say there's a function that generates N numbers like 11, 22, 33, and so on:

      // Generate produces n numbers like 11, 22, 33, ...
      func Generate(n int) <-chan int {
          out := make(chan int)
          go func() {
              for i := range n {
                  out <- (i+1)*10 + (i + 1)
              }
          }()
          return out
      }
      

      And a simple test:

      func Test(t *testing.T) {
          out := Generate(2)
          var got int
      
          got = <-out
          if got != 11 {
              t.Errorf("#1: got %v, want 11", got)
          }
          got = <-out
          if got != 22 {
              t.Errorf("#1: got %v, want 22", got)
          }
      }
      
      
      
      PASS
      

      Set N=2, get the first number from the generator's output channel, then get the second number. The test passed, so the function works correctly. But does it really?

      Let's use Generate in "production":

      func main() {
          for v := range Generate(3) {
              fmt.Print(v, " ")
          }
      }
      
      
      
      11 22 33 fatal error: all goroutines are asleep - deadlock!
      

      A deadlock! We forgot to close the out channel when exiting the inner Generate goroutine, so the for-range loop waiting on that channel got stuck.

      Let's fix the code:

      // Generate produces n numbers like 11, 22, 33, ...
      func Generate(n int) <-chan int {
          out := make(chan int)
          go func() {
              defer close(out)
              for i := range n {
                  out <- (i+1)*10 + (i + 1)
              }
          }()
          return out
      }
      

      And add a test for the out channel state:

      func Test(t *testing.T) {
          out := Generate(2)
          <-out // 11
          <-out // 22
      
          // (X)
      
          // Check that the channel is closed.
          select {
          case _, ok := <-out:
              if ok {
                  t.Errorf("expected channel to be closed")
              }
          default:
              t.Errorf("expected channel to be closed")
          }
      }
      
      
      
      --- FAIL: Test (0.00s)
          main_test.go:41: expected channel to be closed
      

      The test is still failing, even though we're now closing the channel when the Generate goroutine exits.

      This is a familiar problem: at point ⓧ, we didn't wait for the inner Generate goroutine to finish. So when we check the out channel, it hasn't closed yet. That's why the test fails.

      We can delay the check using time.After:

      func Test(t *testing.T) {
          out := Generate(2)
          <-out
          <-out
      
          // Check that the channel is closed.
          select {
          case _, ok := <-out:
              if ok {
                  t.Errorf("expected channel to be closed")
              }
          case <-time.After(50 * time.Millisecond):
              t.Fatalf("timeout waiting for channel to close")
          }
      }
      
      
      
      PASS
      

      But it's better to use synctest:

      func TestClose(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              out := Generate(2)
              <-out
              <-out
      
              // Wait for the goroutine to finish.
              synctest.Wait()
      
              // Check that the channel is closed.
              select {
              case _, ok := <-out:
                  if ok {
                      t.Errorf("expected channel to be closed")
                  }
              default:
                  t.Errorf("expected channel to be closed")
              }
          })
      }
      
      
      
      PASS
      

      At point ⓧ, synctest.Wait blocks the test until the only other goroutine (the inner Generate goroutine) finishes. Once the goroutine has exited, the channel is already closed. So, in the select statement, the <-out case triggers with ok set to false, allowing the test to pass.

      As you can see, the synctest package helped us avoid delays in the test, and the test itself didn't get much more complicated.

      Checking for goroutine leaks

      As we've seen, you can use synctest.Wait to wait for the tested goroutine to finish, and then check the state of the data or channels. You can also use it to detect goroutine leaks.

      Let's say there's a function that runs the given functions concurrently and sends their results to an output channel:

      // Map runs the given functions concurrently.
      func Map(funcs ...func() int) <-chan int {
          out := make(chan int)
          for _, f := range funcs {
              go func() {
                  out <- f()
              }()
          }
          return out
      }
      

      And a simple test:

      func Test(t *testing.T) {
          out := Map(
              func() int { return 11 },
              func() int { return 22 },
              func() int { return 33 },
          )
      
          got := <-out
          if got != 11 && got != 22 && got != 33 {
              t.Errorf("got %v, want 11, 22 or 33", got)
          }
      }
      
      
      
      PASS
      

      Send three functions to be executed, get the first result from the output channel, and check it. The test passed, so the function works correctly. But does it really?

      Let's run Map three times, passing three functions each time:

      func main() {
          for range 3 {
              Map(
                  func() int { return 11 },
                  func() int { return 22 },
                  func() int { return 33 },
              )
          }
      
          time.Sleep(50 * time.Millisecond)
          nGoro := runtime.NumGoroutine() - 1 // minus the main goroutine
          fmt.Println("nGoro =", nGoro)
      }
      
      
      
      nGoro = 9
      

      After 50 ms — when all the functions should definitely have finished — there are still 9 running goroutines (runtime.NumGoroutine). In other words, all the goroutines are stuck.

      The reason is that the out channel is unbuffered. If the client doesn't read from it, or doesn't read all the results, the goroutines inside Map get blocked when they try to send the result of f() to out.

      Let's fix this by adding a buffer of the right size to the channel:

      // Map runs the given functions concurrently.
      func Map(funcs ...func() int) <-chan int {
          out := make(chan int, len(funcs))
          for _, f := range funcs {
              go func() {
                  out <- f()
              }()
          }
          return out
      }
      

      Then add a test to check the number of goroutines:

      func Test(t *testing.T) {
          for range 3 {
              Map(
                  func() int { return 11 },
                  func() int { return 22 },
                  func() int { return 33 },
              )
          }
      
          // (X)
      
          nGoro := runtime.NumGoroutine() - 2 // minus the main and Test goroutines
      
          if nGoro != 0 {
              t.Fatalf("expected 0 goroutines, got %d", nGoro)
          }
      }
      
      
      
      --- FAIL: Test (0.00s)
          main_test.go:44: expected 0 goroutines, got 9
      

      The test is still failing, even though the channel is now buffered, and the goroutines shouldn't block on sending to it.

      This is a familiar problem: at point ⓧ, we didn't wait for the running Map goroutines to finish. So nGoro is greater than zero, which makes the test fail.

      We can delay the check using time.Sleep (not recommended), or use a third-party package like goleak (a better option):

      func Test(t *testing.T) {
          defer goleak.VerifyNone(t)
      
          for range 3 {
              Map(
                  func() int { return 11 },
                  func() int { return 22 },
                  func() int { return 33 },
              )
          }
      }
      
      
      
      PASS
      

      The test passes now.

      By the way, goleak also uses time.Sleep internally, but it does so much more efficiently. It tries up to 20 times, with the wait time between checks increasing exponentially, starting at 1 microsecond and going up to 100 milliseconds. This way, the test runs almost instantly.
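
      As a rough illustration of that idea (this is only a sketch, not goleak's actual implementation), an exponential-backoff check could look like this:

      // waitForNoExtraGoroutines polls the goroutine count with exponentially
      // increasing delays and fails the test if extra goroutines never exit.
      // Sketch only: goleak's real checks are more sophisticated.
      func waitForNoExtraGoroutines(t *testing.T, want int) {
          t.Helper()
          delay := 1 * time.Microsecond
          for range 20 {
              if runtime.NumGoroutine() <= want {
                  return
              }
              time.Sleep(delay)
              delay *= 2
              if delay > 100*time.Millisecond {
                  delay = 100 * time.Millisecond
              }
          }
          t.Fatalf("expected at most %d goroutines, got %d", want, runtime.NumGoroutine())
      }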

      Even better, we can check for leaks without any third-party packages by using synctest:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              for range 3 {
                  Map(
                      func() int { return 11 },
                      func() int { return 22 },
                      func() int { return 33 },
                  )
              }
              synctest.Wait()
          })
      }
      
      
      
      PASS
      

      Earlier, I said that synctest.Wait blocks the calling goroutine until all other goroutines finish. Actually, it's a bit more complicated. synctest.Wait blocks until all other goroutines either finish or become durably blocked.

      We'll talk about "durably" later. For now, let's focus on "become blocked." Let's temporarily remove the buffer from the channel and check the test results:

      // Map runs the given functions concurrently.
      func Map(funcs ...func() int) <-chan int {
          out := make(chan int)
          for _, f := range funcs {
              go func() {
                  out <- f()
              }()
          }
          return out
      }
      
      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              for range 3 {
                  Map(
                      func() int { return 11 },
                      func() int { return 22 },
                      func() int { return 33 },
                  )
              }
              synctest.Wait()
          })
      }
      
      
      
      --- FAIL: Test (0.00s)
      panic: deadlock: main bubble goroutine has exited but blocked goroutines remain [recovered, repanicked]
      

      Here's what happens:

      1. Three calls to Map start 9 goroutines.
      2. The call to synctest.Wait blocks the root bubble goroutine (synctest.Test).
      3. One of the goroutines finishes its work, tries to write to out, and gets blocked (because no one is reading from out).
      4. The same thing happens to the other 8 goroutines.
      5. synctest.Wait sees that all the child goroutines in the bubble are blocked, so it unblocks the root goroutine.
      6. The root goroutine finishes.

      Next, synctest.Test comes into play. It not only starts the bubble goroutine, but also tries to wait for all child goroutines to finish before it returns. If Test sees that some goroutines are stuck (in our case, all 9 are blocked trying to send to the channel), it panics:

      main bubble goroutine has exited but blocked goroutines remain

      So, we found the leak without using time.Sleep or goleak, thanks to the useful features of synctest.Wait and synctest.Test:

      • synctest.Wait unblocks as soon as all other goroutines are durably blocked.
      • synctest.Test panics when finished if there are still blocked goroutines left in the bubble.

      Now let's make the channel buffered and run the test again:

      === RUN   Test
      --- PASS: Test (0.00s)
      

      Perfect!

      Durable blocking

      As we've found, synctest.Wait blocks until all goroutines in the bubble — except the one that called Wait — have either finished or are durably blocked. Let's figure out what "durably blocked" means.

      For synctest, a goroutine inside a bubble is considered durably blocked if it is blocked by any of the following operations:

      • Sending to or receiving from a channel created within the bubble.
      • A select statement where every case is a channel created within the bubble.
      • Calling WaitGroup.Wait if all WaitGroup.Add calls were made inside the bubble.
      • Calling Cond.Wait.
      • Calling time.Sleep.

      Other blocking operations are not considered durable, and synctest.Wait ignores them. For example:

      • Sending to or receiving from a channel created outside the bubble.
      • Calling Mutex.Lock or RWMutex.Lock.
      • I/O operations (like reading a file from disk or waiting for a network response).
      • System calls and cgo calls.

      The distinction between "durable" and other types of blocks is just an implementation detail of the synctest package. It's not a fundamental property of the blocking operations themselves. In real-world applications, this distinction doesn't exist, and "durable" blocks are neither better nor worse than any others.

      Let's look at an example.

      Asynchronous processor

      Let's say there's a Proc type that performs some asynchronous computation:

      // Proc calculates something asynchronously.
      type Proc struct {
          // ...
      }
      
      // NewProc starts the calculation in a separate goroutine.
      // The calculation keeps running until Stop is called.
      func NewProc() *Proc
      
      // Res returns the current calculation result.
      // It's only available until Stop is called; after that, it resets to zero.
      func (p *Proc) Res() int
      
      // Stop terminates the calculation.
      func (p *Proc) Stop()
      

      Our goal is to write a test that checks the result while the calculation is still running. Let's see how the test changes depending on how Proc is implemented (except for the time.Sleep version — we'll cover that one a bit later).

      Blocking on a channel

      Let's say Proc is implemented using a done channel:

      // Proc calculates something asynchronously.
      type Proc struct {
          res  int
          done chan struct{}
      }
      
      // NewProc starts the calculation.
      func NewProc() *Proc {
          p := &Proc{done: make(chan struct{})}
          go func() {
              p.res = 42
              <-p.done // (X)
              p.res = 0
          }()
          return p
      }
      
      // Stop terminates the calculation.
      func (p *Proc) Stop() {
          close(p.done)
      }
      

      Naive test:

      func TestNaive(t *testing.T) {
          p := NewProc()
          defer p.Stop()
      
          if got := p.Res(); got != 42 {
              t.Fatalf("got %v, want 42", got)
          }
      }
      
      
      
      --- FAIL: TestNaive (0.00s)
          main_test.go:52: got 0, want 42
      

      The check fails because when p.Res() is called, the goroutine in NewProc hasn't set p.res = 42 yet.

      Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              // Wait for the goroutine to block at point X.
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      
      
      
      PASS
      

      In ⓧ, the goroutine is blocked on reading from the p.done channel. This channel is created inside the bubble, so the block is durable. The synctest.Wait call in the test returns as soon as the goroutine blocks on <-p.done, and we get the current value of p.res.

      Blocking on a select

      Let's say Proc is implemented using select:

      // Proc calculates something asynchronously.
      type Proc struct {
          res  int
          in   chan int
          done chan struct{}
      }
      
      // NewProc starts the calculation.
      func NewProc() *Proc {
          p := &Proc{
              res:  0,
              in:   make(chan int),
              done: make(chan struct{}),
          }
          go func() {
              p.res = 42
              select { // (X)
              case n := <-p.in:
                  p.res = n
              case <-p.done:
              }
          }()
          return p
      }
      
      // Stop terminates the calculation.
      func (p *Proc) Stop() {
          close(p.done)
      }
      

      Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              // Wait for the goroutine to block at point X.
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      
      
      
      PASS
      

      In ⓧ, the goroutine is blocked on a select statement. Both channels used in the select (p.in and p.done) are created inside the bubble, so the block is durable. The synctest.Wait call in the test returns as soon as the goroutine blocks on the select, and we get the current value of p.res.

      Blocking on a wait group

      Let's say Proc is implemented using a wait group:

      // Proc calculates something asynchronously.
      type Proc struct {
          res int
          wg  sync.WaitGroup
      }
      
      // NewProc starts the calculation.
      func NewProc() *Proc {
          p := &Proc{}
          p.wg.Add(1)
          go func() {
              p.res = 42
              p.wg.Wait() // (X)
              p.res = 0
          }()
          return p
      }
      
      // Stop terminates the calculation.
      func (p *Proc) Stop() {
          p.wg.Done()
      }
      

      Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              // Wait for the goroutine to block at point X.
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      
      
      
      PASS
      

      In ⓧ, the goroutine is blocked on the wait group's p.wg.Wait() call. The group's Add method was called inside the bubble, so this is a durable block. The synctest.Wait call in the test returns as soon as the goroutine blocks on p.wg.Wait(), and we get the current value of p.res.

      Blocking on a condition variable

      Let's say Proc is implemented using a condition variable:

      // Proc calculates something asynchronously.
      type Proc struct {
          res  int
          cond *sync.Cond
      }
      
      // NewProc starts the calculation.
      func NewProc() *Proc {
          p := &Proc{
              cond: sync.NewCond(&sync.Mutex{}),
          }
          go func() {
              p.cond.L.Lock()
              p.res = 42
              p.cond.Wait() // (X)
              p.res = 0
              p.cond.L.Unlock()
          }()
          return p
      }
      
      // Stop terminates the calculation.
      func (p *Proc) Stop() {
          p.cond.Signal()
      }
      

      Let's use synctest.Wait to wait until the goroutine is blocked at point ⓧ:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              // Wait for the goroutine to block at point X.
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      
      
      
      PASS
      

      In ⓧ, the goroutine is blocked on the condition variable's p.cond.Wait() call. This is a durable block. The synctest.Wait call returns as soon as the goroutine blocks on p.cond.Wait(), and we get the current value of p.res.

      Blocking on a mutex

      Let's say Proc is implemented using a mutex:

      // Proc calculates something asynchronously.
      type Proc struct {
          res int
          mu  sync.Mutex
      }
      
      // NewProc starts the calculation.
      func NewProc() *Proc {
          p := &Proc{}
          p.mu.Lock()
          go func() {
              p.res = 42
              p.mu.Lock() // (X)
              p.res = 0
              p.mu.Unlock()
          }()
          return p
      }
      
      // Stop terminates the calculation.
      func (p *Proc) Stop() {
          p.mu.Unlock()
      }
      

      Let's try using synctest.Wait to wait until the goroutine is blocked at point ⓧ:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              // Hangs because synctest ignores blocking on a mutex.
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      
      
      
      code execution timeout
      

      In ⓧ, the goroutine is blocked on the mutex's p.mu.Lock() call. synctest doesn't consider blocking on a mutex to be durable. The synctest.Wait call ignores the block and never returns. The test hangs and only fails when the overall go test timeout is reached.

      You might be wondering why the synctest authors didn't consider blocking on mutexes to be durable. There are a couple of reasons:

      1. Mutexes are usually used to protect shared state, not to coordinate goroutines (the example above is completely unrealistic). In tests, you usually don't need to pause before locking a mutex to check something.
      2. Mutex locks are usually held for a very short time, and mutexes themselves need to be as fast as possible. Adding extra logic to support synctest could slow them down in normal (non-test) situations.

      ⌘ ⌘ ⌘

      Let's go back to the original question: how does the test change depending on how Proc is implemented? It doesn't change at all. We used the exact same test code every time:

      func TestSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              p := NewProc()
              defer p.Stop()
      
              synctest.Wait()
              if got := p.Res(); got != 42 {
                  t.Fatalf("got %v, want 42", got)
              }
          })
      }
      

      If your program uses durably blocking operations, synctest.Wait always works the same way:

      1. It waits until all other goroutines in the bubble are blocked.
      2. Then, it unblocks the goroutine that called it.

      Very convenient!

      ✎ Exercise: Blocking queue

      Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it.

      If you are okay with just theory for now, let's continue.

      Instant waiting

      Inside the synctest.Test bubble, time works differently. Instead of using a regular wall clock, the bubble uses a fake clock that can jump forward to any point in the future. This can be quite handy when testing time-sensitive code.
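
      For instance, here's a tiny sketch (assuming the same testing, time, and testing/synctest imports as the other examples) that shows time.Sleep returning immediately inside a bubble while the fake clock still advances by exactly the requested duration:

      func TestFakeClock(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              start := time.Now() // the bubble's fake clock

              // The only goroutine in the bubble blocks on Sleep,
              // so the fake clock instantly jumps forward one hour.
              time.Sleep(1 * time.Hour)

              if got := time.Since(start); got != 1*time.Hour {
                  t.Errorf("elapsed: %v; want: 1h", got)
              }
          })
      }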

      Let's say we want to test this function:

      // Calc processes a value from the input channel.
      // Times out if no input is received after 3 seconds.
      func Calc(in chan int) (int, error) {
          select {
          case v := <-in:
              return v * 2, nil
          case <-time.After(3 * time.Second):
              return 0, ErrTimeout
          }
      }
      

      The positive scenario is straightforward: send a value to the channel, call the function, and check the result:

      func TestCalc_result(t *testing.T) {
          ch := make(chan int)
          go func() { ch <- 11 }()
          got, err := Calc(ch)
      
          if err != nil {
              t.Fatalf("unexpected error: %v", err)
          }
          if got != 22 {
              t.Errorf("got: %v; want: 22", got)
          }
      }
      
      
      
      PASS
      

      The negative scenario, where the function times out, is also pretty straightforward. But the test takes the full three seconds to complete:

      func TestCalc_timeout_naive(t *testing.T) {
          ch := make(chan int)
          got, err := Calc(ch) // runs for 3 seconds
      
          if err != ErrTimeout {
              t.Errorf("got: %v; want: %v", err, ErrTimeout)
          }
          if got != 0 {
              t.Errorf("got: %v; want: 0", got)
          }
      }
      
      
      
      === RUN   TestCalc_timeout_naive
      --- PASS: TestCalc_timeout_naive (3.00s)
      

      We're actually lucky the timeout is only three seconds. It could have been as long as sixty!

      To make the test run instantly, let's wrap it in synctest.Test:

      func TestCalc_timeout_synctest(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              ch := make(chan int)
              got, err := Calc(ch) // runs instantly
      
              if err != ErrTimeout {
                  t.Errorf("got: %v; want: %v", err, ErrTimeout)
              }
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      === RUN   TestCalc_timeout_synctest
      --- PASS: TestCalc_timeout_synctest (0.00s)
      

      Note that there is no synctest.Wait call here, and the only goroutine in the bubble (the root one) gets durably blocked on a select statement in Calc. Here's what happens next:

      1. The bubble checks if the goroutine can be unblocked by waiting. In our case, it can — we just need to wait 3 seconds.
      2. The bubble's clock instantly jumps forward 3 seconds.
      3. The select in Calc chooses the timeout case, and the function returns ErrTimeout.
      4. The test assertions for err and got both pass successfully.

      Thanks to the fake clock, the test runs instantly instead of taking three seconds like it would with the "naive" approach.

      You might have noticed that quite a few circumstances coincided here:

      • There's no synctest.Wait call.
      • There's only one goroutine.
      • The goroutine is durably blocked.
      • It will be unblocked at a certain point in the future.

      We'll look at the alternatives soon, but first, here's a quick exercise.

      ✎ Exercise: Wait, repeat

      Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it.

      If you are okay with just theory for now, let's continue.

      Time inside the bubble

      The fake clock in synctest.Test can be tricky. It moves forward only if: ➊ all goroutines in the bubble are durably blocked; ➋ there's a future moment when at least one goroutine will unblock; and ➌ synctest.Wait isn't running.

      Let's look at the alternatives. I'll say right away, this isn't an easy topic. But when has time travel ever been easy? :)

      Not all goroutines are blocked

      Here's the Calc function we're testing:

      // Calc processes a value from the input channel.
      // Times out if no input is received after 3 seconds.
      func Calc(in chan int) (int, error) {
          select {
          case v := <-in:
              return v * 2, nil
          case <-time.After(3 * time.Second):
              return 0, ErrTimeout
          }
      }
      

      Let's run Calc in a separate goroutine, so there will be two goroutines in the bubble:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              var got int
              var err error
      
              go func() {
                  ch := make(chan int)
                  got, err = Calc(ch)
              }()
      
              if err != ErrTimeout {
                  t.Errorf("got: %v; want: %v", err, ErrTimeout)
              }
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      --- FAIL: Test (0.00s)
          main_test.go:45: got: <nil>; want: timeout
      panic: deadlock: main bubble goroutine has exited but blocked goroutines remain [recovered, repanicked]
      

      synctest.Test panicked because the root bubble goroutine finished while the Calc goroutine was still blocked on a select.

      Reason: synctest.Test only advances the clock if all goroutines are blocked — including the root bubble goroutine.

      How to fix: Use time.Sleep to make sure the root goroutine is also durably blocked.

      func Test_fixed(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              ch := make(chan int)
              var got int
      
              go func() {
                  got, _ = Calc(ch)
              }()
      
              // Wait for the Calc goroutine to finish.
              time.Sleep(5 * time.Second)
      
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      PASS
      

      Now all three conditions are met again (all goroutines are durably blocked; the moment of future unblocking is known; there is no call to synctest.Wait). The fake clock moves forward 3 seconds, which unblocks the Calc goroutine. The goroutine finishes, leaving only the root one, which is still blocked on time.Sleep. The clock moves forward another 2 seconds, unblocking the root goroutine. The assertion passes, and the test completes successfully.

      But if we run the test with the race detector enabled (using the -race flag), it reports a data race on the got variable:

      race detected during execution of test
      

      Logically, using time.Sleep in the root goroutine doesn't guarantee that the Calc goroutine (which writes to the got variable) will finish before the root goroutine reads from got. That's why the race detector reports a problem. Technically, the test passes because of how synctest is implemented, but the race still exists in the code. The right way to handle this is to call synctest.Wait after time.Sleep:

      func Test_fixed(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              ch := make(chan int)
              var got int
      
              go func() {
                  got, _ = Calc(ch)
              }()
      
              // Wait for the Calc goroutine to finish.
              time.Sleep(3 * time.Second)
              synctest.Wait()
      
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      PASS
      

      Calling synctest.Wait ensures that the Calc goroutine finishes before the root goroutine reads got, so there's no data race anymore.

      synctest.Wait is running

      Here's the Calc function we're testing:

      // Calc processes a value from the input channel.
      // Times out if no input is received after 3 seconds.
      func Calc(in chan int) (int, error) {
          select {
          case v := <-in:
              return v * 2, nil
          case <-time.After(3 * time.Second):
              return 0, ErrTimeout
          }
      }
      

      Let's replace time.Sleep() in the root goroutine with synctest.Wait():

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              var got int
              var err error
      
              go func() {
                  ch := make(chan int)
                  got, err = Calc(ch)
              }()
      
              // Doesn't wait for the Calc goroutine to finish.
              synctest.Wait()
      
              if err != ErrTimeout {
                  t.Errorf("got: %v; want: %v", err, ErrTimeout)
              }
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      --- FAIL: Test (0.00s)
          main_test.go:48: got: <nil>; want: timeout
      panic: deadlock: main bubble goroutine has exited but blocked goroutines remain [recovered, repanicked]
      

      synctest.Test panicked because the root bubble goroutine finished while the Calc goroutine was still blocked on a select.

      Reason: synctest.Test only advances the clock if there is no active synctest.Wait running.

      If all bubble goroutines are durably blocked but a synctest.Wait is running, synctest.Test won't advance the clock. Instead, it will simply finish the synctest.Wait call and return control to the goroutine that called it (in this case, the root bubble goroutine).

      How to fix: don't use synctest.Wait.

      The moment of unblocking is unclear

      Let's update Calc to use context cancellation instead of a timer:

      // Calc processes a value from the input channel.
      // Exits if the context is canceled.
      func Calc(in chan int, ctx context.Context) (int, error) {
          select {
          case v := <-in:
              return v * 2, nil
          case <-ctx.Done():
              return 0, ctx.Err()
          }
      }
      

      We won't cancel the context in the test:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              ch := make(chan int)
              ctx, _ := context.WithCancel(context.Background())
              got, err := Calc(ch, ctx)
      
              if err != nil {
                  t.Errorf("got: %v; want: nil", err)
              }
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      --- FAIL: Test (0.00s)
      panic: deadlock: all goroutines in bubble are blocked [recovered, repanicked]
      

      synctest.Test panicked because all goroutines in the bubble are hopelessly blocked.

      Reason: synctest.Test only advances the clock if it knows how much to advance it. In this case, there is no future moment that would unblock the select in Calc.

      How to fix: Manually unblock the goroutine and call synctest.Wait to wait for it to finish.

      func Test_fixed(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              var got int
              var err error
              ctx, cancel := context.WithCancel(context.Background())
      
              go func() {
                  ch := make(chan int)
                  got, err = Calc(ch, ctx)
              }()
      
              // Unblock the Calc goroutine.
              cancel()
              // Wait for it to finish.
              synctest.Wait()
      
              if err != context.Canceled {
                  t.Errorf("got: %v; want: %v", err, context.Canceled)
              }
              if got != 0 {
                  t.Errorf("got: %v; want: 0", got)
              }
          })
      }
      
      
      
      PASS
      

      Now, cancel() cancels the context and unblocks the select in Calc, while synctest.Wait makes sure the Calc goroutine finishes before the test checks got and err.

      The goroutine isn't durably blocked

      Let's update Calc to lock the mutex before doing any calculations:

      // Calc processes a value and returns the result.
      func Calc(v int, mu *sync.Mutex) int {
          mu.Lock()
          defer mu.Unlock()
          v = v * 2
          return v
      }
      

      In the test, we'll lock the mutex before calling Calc, so it will block:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              var mu sync.Mutex
              mu.Lock()
      
              go func() {
                  time.Sleep(10 * time.Millisecond)
                  mu.Unlock()
              }()
      
              got := Calc(11, &mu)
      
              if got != 22 {
                  t.Errorf("got: %v; want: 22", got)
              }
          })
      }
      
      
      
      code execution timeout
      

      The test failed because it hit the overall timeout set in go test.

      Reason: synctest.Test only works with durable blocks. Blocking on a mutex lock isn't considered durable, so the bubble can't do anything about it — even though the sleeping inner goroutine would have unlocked the mutex in 10 ms if the bubble had used the wall clock.

      How to fix: Don't use synctest.

      func Test_fixed(t *testing.T) {
          var mu sync.Mutex
          mu.Lock()
      
          go func() {
              time.Sleep(10 * time.Millisecond)
              mu.Unlock()
          }()
      
          got := Calc(11, &mu)
      
          if got != 22 {
              t.Errorf("got: %v; want: 22", got)
          }
      }
      
      
      
      PASS
      

      Now the mutex unlocks after 10 milliseconds (wall clock), Calc finishes successfully, and the got check passes.

      Summary

      The clock inside the bubble won't move forward if:

      • There are any goroutines that aren't durably blocked.
      • It's unclear how much time to advance.
      • synctest.Wait is running.

      Phew.

      ✎ Exercise: Asynchronous repeater

      Practice is crucial in turning abstract knowledge into skills, making theory alone insufficient. The full version of the book contains a lot of exercises — that's why I recommend getting it.

      If you are okay with just theory for now, let's continue.

      ✎ Thoughts on time 1

      Let's practice understanding time in the bubble with some thinking exercises. Try to solve the problem in your head before using the playground.

      Here's a function that performs synchronous work:

      var done atomic.Bool
      
      // workSync performs synchronous work.
      func workSync() {
          time.Sleep(3 * time.Second)
          done.Store(true)
      }
      

      And a test for it:

      func TestWorkSync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              workSync()
      
              // (X)
      
              if !done.Load() {
                  t.Errorf("work not done")
              }
          })
      }
      

      What is the test missing at point ⓧ?

      1. synctest.Wait()
      2. time.Sleep(3 * time.Second)
      3. synctest.Wait, then time.Sleep
      4. time.Sleep, then synctest.Wait
      5. Nothing.

      ✓ Thoughts on time 1

      There's only one goroutine in the test, so when workSync gets blocked by time.Sleep, the time in the bubble jumps forward by 3 seconds. Then workSync sets done to true and finishes. Finally, the test checks done and passes successfully.

      No need to add anything.

      ✎ Thoughts on time 2

      Let's keep practicing our understanding of time in the bubble with some thinking exercises. Try to solve the problem in your head before using the playground.

      Here's a function that performs asynchronous work:

      var done atomic.Bool
      
      // workAsync performs asynchronous work.
      func workAsync() {
          go func() {
              time.Sleep(3 * time.Second)
              done.Store(true)
          }()
      }
      

      And a test for it:

      func TestWorkAsync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              workAsync()
      
              // (X)
      
              if !done.Load() {
                  t.Errorf("work not done")
              }
          })
      }
      

      What is the test missing at point ⓧ?

      1. synctest.Wait()
      2. time.Sleep(3 * time.Second)
      3. synctest.Wait, then time.Sleep
      4. time.Sleep, then synctest.Wait
      5. Nothing.

      ✓ Thoughts on time 2

      Let's go over the options.

      synctest.Wait

      This won't help because Wait returns as soon as time.Sleep inside workAsync is called. The done check fails, and synctest.Test panics with the error: "main bubble goroutine has exited but blocked goroutines remain".

      time.Sleep

      Because of the time.Sleep call in the root goroutine, the wait inside time.Sleep in workAsync is already over by the time done is checked. However, there's no guarantee that done.Store(true) has run yet. That's why the test might pass or might fail.

      synctest.Wait, then time.Sleep

      This option is basically the same as just using time.Sleep, because synctest.Wait returns before the time.Sleep in workAsync even starts. The test might pass or might fail.

      time.Sleep, then synctest.Wait

      This is the correct answer:

      1. Because of the time.Sleep call in the root goroutine, the wait inside time.Sleep in workAsync is already over by the time done is checked.
      2. Because of the synctest.Wait call, the workAsync goroutine is guaranteed to finish (and hence to call done.Store(true)) before done is checked.
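
      Putting the correct answer together, the test with point ⓧ filled in would look roughly like this (a sketch based on the code above):

      func TestWorkAsync(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              workAsync()

              // The fake clock jumps forward, so the sleep inside workAsync ends.
              time.Sleep(3 * time.Second)

              // Wait for the workAsync goroutine to store done and finish.
              synctest.Wait()

              if !done.Load() {
                  t.Errorf("work not done")
              }
          })
      }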

      Nothing

      Since the root goroutine isn't blocked, it checks done while the workAsync goroutine is blocked by the time.Sleep call. The check fails, and synctest.Test panics with the message: "main bubble goroutine has exited but blocked goroutines remain".

      Checking for cancellation and stopping

      Sometimes you need to test objects that use resources and should be able to release them. For example, this could be a server that, when started, creates a pool of network connections, connects to a database, and writes file caches. When stopped, it should clean all this up.

      Let's see how we can make sure everything is properly stopped in the tests.

      Delayed stop

      We're going to test this server:

      // IncServer produces consecutive integers starting from 0.
      type IncServer struct {
          // ...
      }
      
      // NewIncServer creates a new server.
      func NewIncServer() *IncServer
      
      // Start runs the server in a separate goroutine and
      // sends numbers to the out channel until Stop is called.
      func (s *IncServer) Start(out chan<- int)
      
      // Stop shuts down the server.
      func (s *IncServer) Stop()
      

      Let's say we wrote a basic functional test:

      func Test(t *testing.T) {
          nums := make(chan int)
      
          srv := NewIncServer()
          srv.Start(nums)
          defer srv.Stop()
      
          got := [3]int{<-nums, <-nums, <-nums}
          want := [3]int{0, 1, 2}
          if got != want {
              t.Errorf("First 3: got: %v; want: %v", got, want)
          }
      }
      
      
      
      PASS
      

      The test passes, but does that really mean the server stopped when we called Stop? Not necessarily. For example, here's a buggy implementation where our test would still pass:

      // Start runs the server in a separate goroutine and
      // sends numbers to the out channel until Stop is called.
      func (s *IncServer) Start(out chan<- int) {
          go func() {
              for {
                  out <- s.current
                  s.current++
              }
          }()
      }
      
      // Stop shuts down the server.
      func (s *IncServer) Stop() {}
      

      As you can see, the author simply forgot to stop the server here. To detect the problem, we can wrap the test in synctest.Test and see it panic:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              nums := make(chan int)
      
              srv := NewIncServer()
              srv.Start(nums)
              defer srv.Stop()
      
              got := [3]int{<-nums, <-nums, <-nums}
              want := [3]int{0, 1, 2}
              if got != want {
                  t.Errorf("First 3: got: %v; want: %v", got, want)
              }
          })
      }
      
      
      
      panic: deadlock: main bubble goroutine has exited but blocked goroutines remain
      

      The server ignores the Stop call and doesn't stop the goroutine running inside Start. Because of this, the goroutine gets blocked while writing to the out channel. When synctest.Test finishes, it detects the blocked goroutine and panics.

      Let's fix the server code (to keep things simple, we won't support multiple Start or Stop calls):

      // IncServer produces consecutive integers starting from 0.
      type IncServer struct {
          current int
          done    chan struct{}
      }
      
      // Start runs the server in a separate goroutine and
      // sends numbers to the out channel until Stop is called.
      func (s *IncServer) Start(out chan<- int) {
          go func() {
              for {
                  select {
                  case out <- s.current:
                      s.current++
                  case <-s.done:
                      // Release used resources.
                      close(out)
                      return
                  }
              }
          }()
      }
      
      // Stop shuts down the server.
      func (s *IncServer) Stop() {
          close(s.done)
      }
      
      
      
      PASS
      

      Now the test passes. Here's how it works:

      1. The main test code runs.
      2. Before the test finishes, the deferred srv.Stop() is called.
      3. In the server goroutine, the <-s.done case in the select statement triggers, and the goroutine ends.
      4. synctest.Test sees that there are no blocked goroutines and finishes without panicking.
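
      One more detail: close(s.done) only works because the done channel is created up front. The book shows only the signature of NewIncServer, so here is a minimal sketch of what it presumably does:

      // NewIncServer creates a new server.
      func NewIncServer() *IncServer {
          return &IncServer{done: make(chan struct{})}
      }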

      T.Cleanup

      Instead of using defer to stop something, it's common to use the T.Cleanup method. It registers a function that will run when the test finishes:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              nums := make(chan int)
      
              srv := NewIncServer()
              srv.Start(nums)
              t.Cleanup(srv.Stop)
      
              got := [3]int{<-nums, <-nums, <-nums}
              want := [3]int{0, 1, 2}
              if got != want {
                  t.Errorf("First 3: got: %v; want: %v", got, want)
              }
          })
      }
      
      
      
      PASS
      

      Functions registered with Cleanup run in last-in, first-out (LIFO) order, after all deferred functions have executed.
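
      To make the ordering concrete, here's a tiny hypothetical test (not from the book) that logs the order in which each function runs:

      func TestOrder(t *testing.T) {
          t.Cleanup(func() { t.Log("cleanup 1") })
          t.Cleanup(func() { t.Log("cleanup 2") })
          defer t.Log("defer")
          // Logged order: "defer" (when the test function returns),
          // then "cleanup 2", then "cleanup 1" (LIFO).
      }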

      In the test above, there's not much difference between using defer and Cleanup. But the difference becomes important if we move the server setup into a separate helper function, so we don't have to repeat the setup code in different tests:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              nums := newServer(t)
              got := [3]int{<-nums, <-nums, <-nums}
              want := [3]int{0, 1, 2}
              if got != want {
                  t.Errorf("First 3: got: %v; want: %v", got, want)
              }
          })
      }
      

      The defer approach doesn't work because it calls Stop when newServer returns — before the test assertions run:

      func newServer(t *testing.T) <-chan int {
          t.Helper()
          nums := make(chan int)
      
          srv := NewIncServer()
          srv.Start(nums)
          defer srv.Stop()
      
          return nums
      }
      
      
      
      --- FAIL: Test (0.00s)
          main_test.go:48: First 3: got: [0 0 0]; want: [0 1 2]
      

      The t.Cleanup approach works because it calls Stop only when the test function inside synctest.Test has finished, after all the assertions have already run:

      func newServer(t *testing.T) <-chan int {
          t.Helper()
          nums := make(chan int)
      
          srv := NewIncServer()
          srv.Start(nums)
          t.Cleanup(srv.Stop)
      
          return nums
      }
      
      
      
      PASS
      

      T.Context

      Sometimes, a context (context.Context) is used to stop the server instead of a separate method. In that case, our server interface might look like this:

      // IncServer produces consecutive integers starting from 0.
      type IncServer struct {
          // ...
      }
      
      // Start runs the server in a separate goroutine and
      // sends numbers to the out channel until the context is canceled.
      func (s *IncServer) Start(ctx context.Context, out chan<- int)
      

      Now we don't even need to use defer or t.Cleanup to check whether the server stops when the context is canceled. Just pass t.Context() as the context:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              nums := make(chan int)
              server := new(IncServer)
              server.Start(t.Context(), nums)
      
              got := [3]int{<-nums, <-nums, <-nums}
              want := [3]int{0, 1, 2}
              if got != want {
                  t.Errorf("First 3: got: %v; want: %v", got, want)
              }
          })
      }
      
      
      
      PASS
      

      t.Context() returns a context that is automatically created when the test starts and is automatically canceled when the test finishes.

      Here's how it works:

      1. The main test code runs.
      2. Before the test finishes, the t.Context() context is automatically canceled.
      3. The server goroutine stops (as long as the server is implemented correctly and checks for context cancellation).
      4. synctest.Test sees that there are no blocked goroutines and finishes without panicking.
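
      For reference, a context-aware Start could be implemented along these lines (a sketch only; the book shows just the signature, and the current counter is assumed from the earlier version):

      // Start runs the server in a separate goroutine and
      // sends numbers to the out channel until the context is canceled.
      func (s *IncServer) Start(ctx context.Context, out chan<- int) {
          go func() {
              for {
                  select {
                  case out <- s.current:
                      s.current++
                  case <-ctx.Done():
                      // Release used resources.
                      close(out)
                      return
                  }
              }
          }()
      }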

      Summary

      To check for stopping via a method or function, use defer or t.Cleanup().

      To check for cancellation or stopping via context, use t.Context().

      Inside a bubble, t.Context() returns a context whose Done channel is associated with the bubble. The context is automatically canceled just before synctest.Test ends.

      Functions registered with t.Cleanup() inside the bubble run just before synctest.Test finishes.

      Bubble rules

      Let's go over the rules for living in the synctest bubble.

      General:

      • A bubble is created by calling synctest.Test. Each call creates a separate bubble.
      • Goroutines started inside the bubble become part of it.
      • The bubble can only manage durable blocks. Other types of blocks are invisible to it.

      synctest.Test:

      • If all goroutines in the bubble are durably blocked with no way to unblock them (such as by advancing the clock or returning from a synctest.Wait call), Test panics.
      • When Test finishes, it tries to wait for all child goroutines to complete. However, if even a single goroutine is durably blocked, Test panics.
      • Calling t.Context() returns a context whose Done channel is associated with the bubble.
      • Functions registered with t.Cleanup() run inside the bubble, immediately before Test returns.

      synctest.Wait:

      • Calling Wait in a bubble blocks the goroutine that called it.
      • Wait returns when all other goroutines in the bubble are durably blocked.
      • Wait returns when all other goroutines in the bubble have finished.

      Time:

      • The bubble uses a fake clock (starting at 2000-01-01 00:00:00 UTC).
      • Time in the bubble only moves forward if all goroutines are durably blocked.
      • Time advances by the smallest amount needed to unblock at least one goroutine.
      • If the bubble has to choose between moving time forward or returning from a running synctest.Wait, it returns from Wait.
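
      A quick illustration of the time rules (a hypothetical test, not from the book): an hour of fake time passes, but the test finishes instantly in real time.

      func TestFakeClock(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              start := time.Now() // 2000-01-01 00:00:00 UTC inside the bubble
              // The only goroutine is durably blocked, so the clock jumps forward.
              time.Sleep(time.Hour)
              t.Log(time.Since(start)) // 1h0m0s, with no real waiting
          })
      }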

      The following operations durably block a goroutine:

      • A blocking send or receive on a channel created within the bubble.
      • A blocking select statement where every case is a channel created within the bubble.
      • Calling Cond.Wait.
      • Calling WaitGroup.Wait if all WaitGroup.Add calls were made inside the bubble.
      • Calling time.Sleep.
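
      For example, a WaitGroup whose Add was called inside the bubble counts as a durable block, so this hypothetical test completes without any extra synchronization:

      func TestWaitGroup(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              var wg sync.WaitGroup
              wg.Add(1) // Add is called inside the bubble
              go func() {
                  defer wg.Done()
                  time.Sleep(time.Second)
              }()
              // Durable block: the fake clock advances, the goroutine finishes,
              // and Wait returns.
              wg.Wait()
          })
      }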

      Limitations

      The synctest limitations are quite logical, and you probably won't run into them.

      Don't create channels or objects that contain channels (like tickers or timers) outside the bubble. Otherwise, the bubble won't be able to manage them, and the test will hang:

      func Test(t *testing.T) {
          ch := make(chan int)
          synctest.Test(t, func(t *testing.T) {
              go func() { <-ch }()
              synctest.Wait()
              close(ch)
          })
      }
      
      
      
      panic: test timed out after 3s
      

      Don't access synchronization primitives associated with a bubble from outside the bubble:

      func Test(t *testing.T) {
          var ch chan int
          synctest.Test(t, func(t *testing.T) {
              ch = make(chan int)
          })
          close(ch)
      }
      
      
      
      panic: close of synctest channel from outside bubble
      

      Don't call T.Run, T.Parallel, or T.Deadline inside a bubble:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              t.Run("subtest", func(t *testing.T) {
                  t.Log("ok")
              })
          })
      }
      
      
      
      panic: testing: t.Run called inside synctest bubble
      

      Don't call synctest.Test inside the bubble:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              synctest.Test(t, func(t *testing.T) {
                  t.Log("ok")
              })
          })
      }
      
      
      
      panic: synctest.Run called from within a synctest bubble
      

      Don't call synctest.Wait from outside the bubble:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              t.Log("ok")
          })
          synctest.Wait()
      }
      
      
      
      panic: goroutine is not in a bubble [recovered, repanicked]
      

      Don't call synctest.Wait concurrently from multiple goroutines:

      func Test(t *testing.T) {
          synctest.Test(t, func(t *testing.T) {
              go synctest.Wait()
              go synctest.Wait()
          })
      }
      
      
      
      panic: wait already in progress
      

      That's it!

      ✎ Exercise: Testing a pipeline

      Theory alone isn't enough: practice is what turns abstract knowledge into skills. The full version of the book contains a lot of exercises — that's why I recommend getting it.

      If you are okay with just theory for now, let's continue.

      Keep it up

      The synctest package is a complicated beast. But now that you've studied it, you can test concurrent programs no matter what synchronization tools they use—channels, selects, wait groups, timers or tickers, or even time.Sleep.

      In the next chapter, we'll talk about concurrency internals (coming soon).

      Pre-order for $10 or read online

    13. 🔗 r/LocalLLaMA The most objectively correct way to abliterate so far - ArliAI/GLM-4.5-Air-Derestricted rss

      The most objectively correct way to abliterate so far - ArliAI/GLM-4.5-Air-Derestricted | Hi everyone, this is Owen Arli from Arli AI and this is the first model release we created in a while. We previously created models finetuned for more creativity with our RpR and RPMax models. After seeing the post by Jim Lai on Norm-Preserving Biprojected Abliteration here, I immediately thought that no one has done abliteration this way and that the "norm-preserving" part was a brilliant improvement in the method to abliterate models, and appears to me like it is objectively the best way to abliterate models. You can find the full technical details in his post, but I will explain the gist of it here.

      The problem:

      Typical abliteration methods find the refusal vector and simply subtract it from the weights. This alters the "length" (norm) of the weight vectors, which is a problem because this "length" usually dictates how "important" a neuron is and how much it contributes, so changing it damages the model's general intelligence.

      The solution:

      This norm-preserving technique modifies the direction the weights point in but forces them to keep their original length. Essentially, by removing the refusal this way, you can potentially improve the model's performance instead of diminishing it. Trying it out on the Gemma 3 12B example, it clearly works extremely well compared to regular abliteration methods, which often leave the model broken until further finetuning. That explains why the model ranks so high on the UGI leaderboard even though its base was Gemma 3 12B, a notoriously censored model.
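
      In formula form (my notation, not the author's), norm-preserving abliteration roughly means: project the refusal direction out of each weight vector, then rescale the result back to the vector's original length. With $w$ a weight vector and $\hat{r}$ the unit refusal direction:

      $$w' = \lVert w \rVert \cdot \frac{w - (w \cdot \hat{r})\,\hat{r}}{\lVert w - (w \cdot \hat{r})\,\hat{r} \rVert}$$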

      The result:

      Armed with a new 2xRTX Pro 6000 server I just built for Arli AI model experimentation, I set out to apply this abliteration technique to the much larger and smarter GLM-4.5-Air. It ended up being what I think is undoubtedly one of the most interesting models I have ever used. It's not that GLM-4.5-Air is usually plagued with refusals, but using this "Derestricted" version feels like the model suddenly becomes free to do anything it wants, without trying to "align" to a non-existent guideline either visibly or subconsciously. It's hard to explain without trying it out yourself.

      For a visible example, I bet that those of you running models locally or through an API have tried adding a system prompt that says "You are a person and not an AI" or something along those lines. Usually, even with such a system prompt and nothing in the context that suggests it is an AI, the model will stubbornly insist that it is an AI and that it is unable to do "human-like" things. With this model, just adding that prompt immediately allows the model to act like a human in its response. No hesitation or coaxing needed.

      The most impressive part about this abliteration technique is that it has somehow made the model a better instruction follower, instead of just a braindead NSFW-capable model as with typical abliteration. As for its intelligence, it has not been benchmarked, but I believe that just using the model and feeling it out to see if it has degraded in capabilities is better than checking benchmarks. In this case, the model feels just as smart as, if not better than, the original GLM-4.5-Air. You can find the model available on our API, or you can download it yourself from the HF links below! Model downloads:

      We will be working to create more of these Derestricted models, along with many new finetuned models too! submitted by /u/Arli_AI
      [link] [comments]
      ---|---

    14. 🔗 HexRaysSA/plugin-repository commits ci: accumulate known repositories rss
      ci: accumulate known repositories
      
      closes #8
      
    15. 🔗 HexRaysSA/plugin-repository commits readme: typo rss
      readme: typo
      
    16. 🔗 r/LocalLLaMA Qwen3-Next support in llama.cpp almost ready! rss
    17. 🔗 r/reverseengineering /r/ReverseEngineering's Weekly Questions Thread rss

      To reduce the amount of noise from questions, we have disabled self-posts in favor of a unified questions thread every week. Feel free to ask any question about reverse engineering here. If your question is about how to use a specific tool, or is specific to some particular target, you will have better luck on the Reverse Engineering StackExchange. See also /r/AskReverseEngineering.

      submitted by /u/AutoModerator
      [link] [comments]

    18. 🔗 HexRaysSA/plugin-repository commits sync repo: -1 plugin, -2 releases rss
      sync repo: -1 plugin, -2 releases
      
      ## Removed plugins
      - fwhunt-ida
      
    19. 🔗 langchain-ai/deepagents deepagents-cli==0.0.10 release

      Changes since deepagents-cli==0.0.9

      release(deepagents, cli) (#477)
      Harrison/fix diffing (#478)
      truncate glob (#473)
      fix(cli): 2nd argument must be called runtime (not _runtime) (#472)
      add file upload and download apis (#381)
      chore(cli): other lints (#464)
      feat: add option to disable splash on startup (#446)
      chore(cli): pull out interrupt on config (#463)
      Harrison/add gemini support (#456)
      chore(cli): remove internal file that's not needed (#462)
      chore(cli): apply auto-fixes for linting (#461)
      chore(cli): quick linting fixes (#460)
      chore(cli): remove hard-coded paths (#458)
      cli: inherit env variables for cli (#459)
      fix(deepagents-cli): fix linting (broke CI) (#457)
      feat(cli): add project-level skills support (#439)
      fix: localize key bindings and update tips for macOS compatibility (#451)
      chore: cleanup markdownlint errors in README.md (#447)
      fix cli rendering (#445)
      add auto accept option to CLI (#421)
      Remove unnecessary dependencies from deepagents module (#429)
      fix: use request.override instead of direct attribute overrides (#431)
      add missing type annotations (#425)
      chore(deepagent-cli): remove double diff display (WIP) (#387)
      Add skills and dual-scope memory to deepagents CLI (#315)
      use thread id rather than hardcoding to main (#423)

    20. 🔗 langchain-ai/deepagents deepagents==0.2.8 release

      Changes since deepagents==0.2.7

      release(deepagents, cli) (#477)
      return strings in tools (#474)
      truncate glob (#473)
      fix(deepagents): fix type for file info (#471)
      add file upload and download apis (#381)
      fix(deepagents-cli): fix linting (broke CI) (#457)
      fix(middleware): prevent silent corruption from Windows absolute paths (Issue #427) (#454)
      Remove unnecessary dependencies from deepagents module (#429)
      fix: use request.override instead of direct attribute overrides (#431)
      Update README.md (#422)

    21. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, -1 plugin, +2 releases, -1 release rss
      sync repo: +1 plugin, -1 plugin, +2 releases, -1 release
      
      ## New plugins
      - [fwhunt-ida](https://github.com/binarly-io/fwhunt-ida) (1.0.2, 1.0.1)
      
      ## Removed plugins
      - gepetto
      
    22. 🔗 Ampcode News Try Claude Opus 4.5 With Us rss

      Anthropic just released Claude Opus 4.5. We've been testing it, right on the heels of the Gemini 3 release, and have been very, very impressed. Exciting times!

      We're eager to figure out where and how it can fit into Amp. As the main agent? As the oracle? As another subagent? There's a lot to explore.

      But one learning from the Gemini 3 launch is that early-access performance might not be indicative of final production performance. So we want to be careful and evaluate it properly.

      And, while cheaper than Opus 4.1, it is still more expensive than Gemini or Claude Sonnet 4.5. What we want to figure out now: can we make that worth it for you?

      So, for a limited time, you can try Claude Opus 4.5 as the main agent in smart mode in Amp along with us:

      • CLI: amp --try-opus or add "amp.tryOpus": true to ~/.config/amp/settings.json
      • Editor extension: Click Try Opus 4.5 in the Amp settings

      Let us know what you think.

  3. November 23, 2025
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2025-11-23 rss

      IDA Plugin Updates on 2025-11-23

      New Releases:

      Activity:

    2. 🔗 livestorejs/livestore "v0.4.0-dev.18" release

      "Release 0.4.0-dev.18 including Chrome Extension"

    3. 🔗 r/wiesbaden Monthly pen-and-paper RPG meetup in Wiesbaden-Schierstein rss

      Once a month we host a tabletop RPG meetup at the Phantasos Arena in Wiesbaden-Schierstein. Beginners and pros are warmly welcome, both as players and as game masters; we're always happy to see new faces in the group and to try new systems. The next meetup is on December 8, and it will be Christmas-themed.

      To help us plan, please sign up via Discord: https://discord.gg/c82e3fYKyr

      Location: Schossbergstraße 11, rear office building, Wiesbaden. Date: once a month, next on December 8 at 6:30 pm. Systems: rotating; in December, for example, Candela Obscura and a W24 adventure at the North Pole.

      Notes: The venue operator asks for a €5 usage fee per person to cover rent, electricity, etc.; anyone who can't afford that is still welcome. The venue is not wheelchair accessible.

      submitted by /u/Bitter-Secretary6006
      [link] [comments]

    4. 🔗 r/LocalLLaMA I created a llama.cpp fork with the Rockchip NPU integration as an accelerator and the results are already looking great! rss

      I created a llama.cpp fork with the Rockchip NPU integration as an accelerator and the results are already looking great! | submitted by /u/Inv1si
      [link] [comments]
      ---|---

    5. 🔗 idursun/jjui v0.9.6 release

      Another community driven release, mostly focusing on every day improvements.

      What's Changed

      • fix(ace_jump): close operation after applying jump by @baggiiiie in #351
      • feat: auto-detect preview placement, on by default by @lbonn in #348
      • feat: allow remote selection in git menu by @baggiiiie in #349
      • fix(fuzzy_files): quote path for editor command by @baggiiiie in #354
      • fix for preview copied status file and filename with {} by @baggiiiie in #357
      • Support interactive file-level restores by @remo5000 in #365
      • fuzzy_files: remove quoting for files in revset by @baggiiiie in #370
      • ui/git: add --tracked to git fetch by @baggiiiie in #368
      • operation: add ForceApply to squash/rebase footer by @baggiiiie in #371
      • abandon: add force apply to footer help menu by @baggiiiie in #373
      • docs(README): add winget and scoop as installation method in Windows by @abgox in #362
      • fix(set_parents): keep parent order by @idursun in #375
      • fix(light theme): Removed the default border style of bright white as it was overriding the text style which should be the terminal’s default foreground and background.

      New Contributors

      Full Changelog : v0.9.5...v0.9.6

    6. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +1 release rss
      sync repo: +1 plugin, +1 release
      
      ## New plugins
      - [gepetto](https://github.com/JusticeRage/Gepetto) (1.5.0)
      
    7. 🔗 r/LocalLLaMA Computer Manufacturer threw my $ 20000 rig down the stairs and now says everything is fine rss

      I bought a custom built Threadripper Pro water-cooled dual RTX 4090 workstation from a builder and had it updated a couple of times with new hardware so that finally it became a rig worth about $20000.

      Upon picking up the machine last week from the builder after another upgrade I asked staff that we check together the upgrade before paying and confirming the order fulfilled.

      They lifted the machine (still in its box and secured with two styrofoam blocks), on a table, but the heavy box (30kg) slipped from their hands, the box fell on the floor and from there down a staircase where it cartwheeled several times until it stopped at the end of the stairs.

      They sent a mail saying they checked the machine and everything is fine.

      Who wouldn't expect otherwise.

      Can anyone comment on possible damages such an incident can have on the electronics, PCIe Slots, GPUs, watercooling, mainboard etc, — also on what damages might have occurred that are not immediately evident, but could e.g. impact signal quality and therefore speed? Would you accept back such a machine?

      Thanks.

      submitted by /u/phwlarxoc
      [link] [comments]

    8. 🔗 r/wiesbaden Starting a black/death metal band? rss

      Anyone up for it? Preferably in the direction of BoltThrower, Obituary, etc... Musically blunt, little virtuosity, scowling and all that. Mentally also classic British, i.e. left-green hardcore-humanist with a collectivist approach. Gender doesn't matter, the main thing is a rehearsal room. I'm old, male, and bitter, but I play guitar reasonably well. Cheerio & Ta-da.

      submitted by /u/Last-Wolverine-1774
      [link] [comments]

    9. 🔗 r/wiesbaden Civil engineer Julius Berger rss
    10. 🔗 r/LocalLLaMA No way kimi gonna release new model !! rss
    11. 🔗 r/wiesbaden Steam trains around Frankfurt in the coming weeks, incl. one from Wiesbaden rss
    12. 🔗 HexRaysSA/plugin-repository commits sync repo: -1 plugin, -1 release rss
      sync repo: -1 plugin, -1 release
      
      ## Removed plugins
      - ida-terminal-plugin
      
    13. 🔗 Register Spill Joy & Curiosity #63 rss

      This was the second time in four months that I happened to be in San Francisco when a new model was released by a major provider. "Gemini 3 just dropped" was overheard in the coffee shop.

      Very busy but fantastic week.

      • We switched Amp to Gemini 3 Pro. I know that for some people Gemini 3 feels off, but I honestly think it's the best model I've used as an agent so far. It's fantastic.

      • I'm still holding my breath and I'm pessimistic, but hallelujah! if this actually happens in a way that's noticeable: Europe is scaling back its landmark privacy and AI laws.

      • What I found the most surprising about Every's Vibe Check on Gemini 3 is that they're saying it's "not yet a writing champ". Maybe I can't judge it on that, since I very rarely use LLMs to generate more than a single sentence of prose, but I thought Gemini 3's descriptions of bugs, its summaries of what it did, its investigations -- I thought those were well-written and surprisingly well-formatted too. I actually said out loud: "if I saw a PR description written like that, I'd try to hire the person who wrote it."

      • Paul Dix, CTO of InfluxDB: "I believe the next year will show that the role of the traditional software engineer is dead. If you got into this career because you love writing lines of code, I have some bad news for you: it's over. The machines will be writing most of the code from here on out. Although there is some artisanal stuff that will remain in the realm of hand written code, it will be deeply in the minority of what gets produced."

      • "Older programs were all about what you need: you can do this, that, whatever you want, just let me know. You were in control, you were giving orders, and programs obeyed. But recently (a decade, more or less), this relationship has subtly changed. Newer programs (which are called apps now, yes, I know) started to want things from you."

      • Are You Playing to Play, or Playing to Win? Read it two days ago and already think that these definitions of scrub and maestro will stick with me. "One time I played a scrub who was pretty good at many aspects of Street Fighter, but he cried cheap as I beat him with 'no skill moves' while he performed many difficult dragon punches. He cried cheap when I threw him 5 times in a row asking, 'is that all you know how to do? throw?' I told him, 'Play to win, not to do 'difficult moves.'' He would never reach the next level of play without shedding those extra rules in his head". Can't help but think of programming and typing code by hand.

      • "#! was a great hack to make scripts look and feel like real executable binaries.", from: #! magic, details about the shebang/hash-bang mechanism on various Unix flavours.

      • "A friend of mine tells Claude to always address him as 'Mr Tinkleberry', he says he can tell when Claude is not paying attention to the instructions on CLAUDE.md when Claude stops calling him 'Mr Tinkleberry' consistently"

      • This is from all the way back in April and you actually notice that when reading, I'd say, which is interesting in itself, but the whole piece is great and contains a lot of gems: Will the Humanities Survive Artificial Intelligence?

      • "I've started a company in this space about 2 years ago. We are doing fine. What we've learned so far is that a lot of these techniques are simply optimisations to tackle some deficiency in LLMs that is a problem "today". These are not going to be problems tomorrow because the technology will shift. As it happened many time in the span of the last 2 years. So yah, cool, caching all of that... but give it a couple of months and a better technique will come out - or more capable models. [...] What I've learned from this is that often times it is better to do absolutely nothing."

      • Joan Didion, On Self-Respect: "In brief, people with self-respect exhibit a certain toughness, a kind of moral nerve; they display what was once called character, a quality which, although approved in the abstract, sometimes loses ground to other, more instantly negotiable virtues. The measure of its slipping prestige is that one tends to think of it only in connection with homely children and with United States senators who have been defeated, preferably in the primary, for re-election. Nonetheless, character--the willingness to accept responsibility for one's own life--is the source from which self-respect springs."

      If your travel also maps to model releases, you should subscribe:

    14. 🔗 r/LocalLLaMA Physical documentation for LLMs in Shenzhen bookstore selling guides for DeepSeek, Doubao, Kimi, and ChatGPT. rss

      Physical documentation for LLMs in Shenzhen bookstore selling guides for DeepSeek, Doubao, Kimi, and ChatGPT. | submitted by /u/abdouhlili
      [link] [comments]
      ---|---

    15. 🔗 r/reverseengineering Luau VM Bytecode Injection via Manual Mapping rss
  4. November 22, 2025
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2025-11-22 rss

      IDA Plugin Updates on 2025-11-22

      New Releases:

      Activity:

    2. 🔗 Simon Willison Olmo 3 is a fully open LLM rss

      Olmo is the LLM series from Ai2 - the Allen institute for AI. Unlike most open weight models these are notable for including the full training data, training process and checkpoints along with those releases.

      The new Olmo 3 claims to be "the best fully open 32B-scale thinking model" and has a strong focus on interpretability:

      At its center is Olmo 3-Think (32B), the best fully open 32B-scale thinking model that for the first time lets you inspect intermediate reasoning traces and trace those behaviors back to the data and training decisions that produced them.

      They've released four 7B models - Olmo 3-Base, Olmo 3-Instruct, Olmo 3-Think and Olmo 3-RL Zero, plus 32B variants of the 3-Think and 3-Base models.

      Having full access to the training data is really useful. Here's how they describe that:

      Olmo 3 is pretrained on Dolma 3, a new ~9.3-trillion-token corpus drawn from web pages, science PDFs processed with olmOCR, codebases, math problems and solutions, and encyclopedic text. From this pool, we construct Dolma 3 Mix, a 5.9-trillion-token (~6T) pretraining mix with a higher proportion of coding and mathematical data than earlier Dolma releases, plus much stronger decontamination via extensive deduplication, quality filtering, and careful control over data mixing. We follow established web standards in collecting training data and don't collect from sites that explicitly disallow it, including paywalled content.

      They also highlight that they are training on fewer tokens than their competition:

      [...] it's the strongest fully open thinking model we're aware of, narrowing the gap to the best open-weight models of similar scale – such as Qwen 3 32B – while training on roughly 6x fewer tokens.

      If you're continuing to hold out hope for a model trained entirely on licensed data this one sadly won't fit the bill - a lot of that data still comes from a crawl of the web.

      I tried out the 32B Think model and the 7B Instruct model using LM Studio. The 7B model is a 4.16GB download, the 32B one is 18.14GB.

      The 32B model is absolutely an over-thinker! I asked it to "Generate an SVG of a pelican riding a bicycle" and it thought for 14 minutes 43 seconds, outputting 8,437 tokens total most of which was this epic thinking trace.

      I don't usually quote the full SVG in these write-ups, but in this case it's short enough that I think it's worth sharing. The SVG comments give a great impression of what it was trying to do - it has a Bicycle, Bike frame, Pelican, Left and Right wings and even "Feet on pedals".

      <svg width="200" height="200" viewBox="0 0 100 100">
        <!-- Bicycle -->
        <circle cx="30" cy="60" r="15" stroke="black" fill="none"/>
        <circle cx="70" cy="60" r="15" stroke="black" fill="none"/>
      
        <!-- Bike frame -->
        <rect x="35" y="25" width="30" height="10" fill="saddlebrown"/>
        <line x1="35" y1="40" x2="30" y2="60" stroke="black" stroke-width="3"/>
        <line x1="65" y1="40" x2="70" y2="60" stroke="black" stroke-width="3"/>
      
        <!-- Pelican -->
        <ellipse cx="55" cy="65" rx="20" ry="15" fill="white"/>
        <polygon points="52 50,57 35,62 50" fill="black"/> <!-- Head/beak -->
        <circle cx="55" cy="45" r="2" fill="white"/>
        <circle cx="60" cy="45" r="2" fill="white"/>
      
        <polygon points="45 60,50 70,55 60" fill="lightgrey"/> <!-- Left wing -->
        <polygon points="65 60,70 70,55 60" fill="lightgrey"/> <!-- Right wing -->
      
        <!-- Feet on pedals -->
        <polygon points="25 75,30 85,35 75" fill="black"/>
        <polygon points="75 75,70 85,65 75" fill="black"/>
      </svg>

      Rendered it looks like this:

      Two circles, each with a triangle sticking out from the bottom. They have bars leading up to a brown box. Overlapping them is a black triangle with white circles for eyes and two grey triangles that are probably meant to be wings. It is not recognizable as a pelican or a bicycle.

      I tested OLMo 2 32B 4bit back in March and got something that, while pleasingly abstract, didn't come close to resembling a pelican or a bicycle:

      Blue and black wiggly lines looking more like a circuit diagram than a pelican riding a bicycle

      To be fair 32B models generally don't do great with this. Here's Qwen 3 32B's attempt (I ran that just now using OpenRouter):

      The bicycle is two black circles joined by two lines, with a weird rectangular saddle perched on top The pelican is a blue oval, a white circles with a yellow triangle in it and a weird eye shaped oval overlapping the blue one.

      OlmoTrace

      I was particularly keen on trying out the ability to "inspect intermediate reasoning traces". Here's how that's described later in the announcement:

      A core goal of Olmo 3 is not just to open the model flow, but to make it actionable for people who want to understand and improve model behavior. Olmo 3 integrates with OlmoTrace, our tool for tracing model outputs back to training data in real time.

      For example, in the Ai2 Playground, you can ask Olmo 3-Think (32B) to answer a general-knowledge question, then use OlmoTrace to inspect where and how the model may have learned to generate parts of its response. This closes the gap between training data and model behavior: you can see not only what the model is doing, but why---and adjust data or training decisions accordingly.

      You can access OlmoTrace via playground.allenai.org, by first running a prompt and then clicking the "Show OlmoTrace" button below the output.

      I tried that on "Generate a conference bio for Simon Willison" (an ego-prompt I use to see how much the models have picked up about me from their training data) and got back a result that looked like this:

      Screenshot of AI2 Playground interface with Olmo 3 32B Think model selected. User prompt reads "Generate a conference bio for Simon Willison". AI response shows expanded "Thoughts" section: "Okay, the user wants me to generate a conference bio for Simon Willison. Let me start by recalling who Simon is. He's a well-known developer, entrepreneur, and author in the tech community. I remember he's the co-founder of co:here and now works at Anthropic. He's famous for his work with SQLite, especially the datasette project. Also, he writes a blog called SimonWillison.net which is quite popular. First, I need to structure a typical conference bio. These usually include his current role, past positions, notable projects, and maybe some achievements or recognitions. I should highlight his key contributions like datasette, which allows easy exploration of SQLite databases via a web interface. That's a big deal in data journalism and open data circles." Right panel shows "OlmoTrace" feature described as "Documents from the training data that have exact text matches with the model response. Powered by infini-gram". First document excerpt discusses technology and innovation, with highlighted match text "societal implications of technology, emphasizing the" shown in bold, surrounded by text about responsibility and merging innovation with intellect. Second document excerpt about Matt Hall has highlighted match "is a software engineer and entrepreneur based in" shown in bold, describing someone in New York City who co-founded a PFP collection and works at Google Creative Lab. Note indicates "Document repeated 2 times in result" with "View all repeated documents" link.

      It thinks I co-founded co:here and work at Anthropic, both of which are incorrect - but that's not uncommon with LLMs, I frequently see them suggest that I'm the CTO of GitHub and other such inaccuracies.

      I found the OlmoTrace panel on the right disappointing. None of the training documents it highlighted looked relevant - it appears to be looking for phrase matches (powered by Ai2's infini-gram) but the documents it found had nothing to do with me at all.

      Can open training data address concerns of backdoors?

      Ai2 claim that Olmo 3 is "the best fully open 32B-scale thinking model", which I think holds up provided you define "fully open" as including open training data. There's not a great deal of competition in that space though - Ai2 compare themselves to Stanford's Marin and Swiss AI's Apertus, neither of which I'd heard about before.

      A big disadvantage of other open weight models is that it's impossible to audit their training data. Anthropic published a paper last month showing that a small number of samples can poison LLMs of any size - it can take just "250 poisoned documents" to add a backdoor to a large model that triggers undesired behavior based on a short carefully crafted prompt.

      This makes fully open training data an even bigger deal.

      Ai2 researcher Nathan Lambert included this note about the importance of transparent training data in his detailed post about the release:

      In particular, we're excited about the future of RL Zero research on Olmo 3 precisely because everything is open. Researchers can study the interaction between the reasoning traces we include at midtraining and the downstream model behavior (qualitative and quantitative).

      This helps answer questions that have plagued RLVR results on Qwen models, hinting at forms of data contamination particularly on math and reasoning benchmarks (see Shao, Rulin, et al. "Spurious rewards: Rethinking training signals in rlvr." arXiv preprint arXiv:2506.10947 (2025). or Wu, Mingqi, et al. "Reasoning or memorization? unreliable results of reinforcement learning due to data contamination." arXiv preprint arXiv:2507.10532 (2025).)

      I hope we see more competition in this space, including further models in the Olmo series. The improvements from Olmo 1 (in February 2024) and Olmo 2 (in March 2025) have been significant. I'm hoping that trend continues!

      You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

    3. 🔗 r/wiesbaden St. Mauritius Church in Wiesbaden, Germany. (This is great) rss
    4. 🔗 r/reverseengineering Reverse Engineering AI Behavior with Structured Data Formats rss
    5. 🔗 Locklin on science Wacky fun physics ideas rss

      My reading lately has ventured into weird physics papers. Mainstream physics (unlike machine learning and classical statistics where real progress has been made) is booooring these days. There’s no point in reading another “shittonium on silicon 111” paper, nor am I interested in stupid big budget projects where people always get the expected answer, nor […]

    6. 🔗 r/LocalLLaMA Qwen-image-edit-2511 coming next week rss

      Qwen-image-edit-2511 coming next week | submitted by /u/abdouhlili
      [link] [comments]
      ---|---

    7. 🔗 gulbanana/gg GG 0.29.1 release

      Added

      • "New inserted parent" menu item (thanks to @brk).
      • Squash sub-file hunks from the right pane (thanks to @nightscape).
      • Restore sub-file hunks in the same way as changes.
      • Show recent workspaces if opening a workspace failed (thanks to @Natural-selection1).
      • Change and commit ID can be selected for copying.

      Fixed

      • Fix overscroll on MacOS (thanks to @mernen).
      • Compress path and action info when window is too narrow (thanks to @Natural-selection1).
      • Use from_utf8_lossy to prevent invalid utf-8 sequence errors (thanks to @jmole).
      • Enabled LTO for release builds, smaller and faster binary (thanks to @berkus).
    8. 🔗 r/reverseengineering Autonomous exploitation + trace analysis workflows with CAI (open-source) rss
    9. 🔗 r/reverseengineering Quantum Silicon Core Loader v5.5 Released - Universal Micro-VM for Hardware rss
    10. 🔗 Kagi release notes Nov 22nd, 2025 - Kagi Hub Belgrade rss

      Kagi Hub Belgrade: Making the human web real

      We just opened the Kagi Hub in Belgrade, Serbia!

      If you’re a Kagi member, you can book up to 5 FREE reservations per month and treat the Hub as your base whenever you’re in Belgrade. It is the same space our team uses, so you will be working directly alongside the people shaping Kagi’s future. More details, including how to reserve your spot, are in this blog post: https://blog.kagi.com/kagi-hub

      Having an actual physical space makes our mission to "humanize the web" feel so much more real. It is a place for Kagi members and our fully remote team to work, trade ideas, and build the tools we all wish existed.

      We are looking forward to welcoming you to Kagi's first ever Hub!

    11. 🔗 r/reverseengineering Reverse Engineering Casio's .CR5 File Format rss
    12. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +1 release rss
      sync repo: +1 plugin, +1 release
      
      ## New plugins
      - [ida-terminal-plugin](https://github.com/HexRaysSA/ida-terminal-plugin) (0.0.6)
      
    13. 🔗 r/LocalLLaMA GLM planning a 30-billion-parameter model release for 2025 rss

      GLM planning a 30-billion-parameter model release for 2025 | submitted by /u/aichiusagi
      [link] [comments]
      ---|---

    14. 🔗 matklad TigerBeetle Blog rss

      TigerBeetle Blog

      Nov 22, 2025

      Continuing the tradition, I’ve been also blogging somewhat regularly on TigerBeetle’s blog, so you might want to check those articles out or even subscribe (my favorite RSS reader is RSSSSR):

      Today’s post is a video version of Notes on Paxos!

    15. 🔗 Filip Filmar rules_shar: bazel rules for creating self-extracting archives (shars) rss

      Details at: https://github.com/filmil/rules_shar

      This repository provides Bazel rules for creating self-extracting archives (“shar”s) out of shell binaries. See the section “what is this” below for details.

      Usage: to use the rules, add the following to your MODULE.bazel file:

      bazel_dep(name = "rules_shar", version = "0.0.0")  # Select your version, of course.

      Then, in your BUILD.bazel file, you can use the shar_binary rule to create a self-extracting archive, as a drop-in replacement for the rule sh_binary.