

to read (pdf)

  1. The Code-Only Agent • Rijnard van Tonder
  2. Agent-native Architectures: How to Build Apps After Code Ends
  3. jj tug | Shaddy's
  4. Don't fall into the anti-AI hype -
  5. sorted string tables (SST) from first principles

  1. January 12, 2026
    1. 🔗 19h/ida-structor v0.3.0 release

      Full Changelog: v0.2.0...v0.3.0

    2. 🔗 r/reverseengineering preserving tomotoru rss
    3. 🔗 r/reverseengineering /r/ReverseEngineering's Weekly Questions Thread rss

      To reduce the amount of noise from questions, we have disabled self-posts in favor of a unified questions thread every week. Feel free to ask any question about reverse engineering here. If your question is about how to use a specific tool, or is specific to some particular target, you will have better luck on the Reverse Engineering StackExchange. See also /r/AskReverseEngineering.

      submitted by /u/AutoModerator

    4. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 release rss
      sync repo: +1 release
      
      ## New releases
      - [unicorn-tracer-arm64](https://github.com/chenxvb/Unicorn-Trace): 0.2
      
    5. 🔗 @cxiao@infosec.exchange [#KPOPDEMONHUNTERS](https://infosec.exchange/tags/KPOPDEMONHUNTERS) FOR BEST mastodon

      #KPOPDEMONHUNTERS FOR BEST ANIMATED FILM #GoldenGlobes #Huntrix

    6. 🔗 @cxiao@infosec.exchange Hello [#kpopmonday](https://infosec.exchange/tags/kpopmonday)! For mastodon

      Hello #kpopmonday! For #AbbreviationsAndAcronyms, let's bring in EXID's DDD!

      I'm posting this specific performance instead of the MV because a) the MV has flashing lights, b) the outfits here are all time best, and c) Solji is back in this performance!!!

      https://youtu.be/ITLZy6i7AD8

      #EXID #kpop

    7. 🔗 @cxiao@infosec.exchange GOLDEN BEST ORIGINAL SONG AT GOLDEN GLOBES LFG mastodon

      GOLDEN BEST ORIGINAL SONG AT GOLDEN GLOBES LFG

      #KPOPDEMONHUNTERS #GoldenGlobes #Huntrix

    8. 🔗 badlogic/pi-mono v0.43.0 release

      Breaking Changes

      • Extension editor (ctx.ui.editor()) now uses Enter to submit and Shift+Enter for newlines, matching the main editor. Previously used Ctrl+Enter to submit. Extensions with hardcoded "ctrl+enter" hints need updating. (#642 by @mitsuhiko)
      • Renamed /branch command to /fork (#641)
        • RPC: branch → fork, get_branch_messages → get_fork_messages
        • SDK: branch() → fork(), getBranchMessages() → getForkMessages()
        • AgentSession: branch() → fork(), getUserMessagesForBranching() → getUserMessagesForForking()
        • Extension events: session_before_branch → session_before_fork, session_branch → session_fork
        • Settings: doubleEscapeAction: "branch" | "tree" → "fork" | "tree"
      • SessionManager.list() and SessionManager.listAll() are now async, returning Promise<SessionInfo[]>. Callers must await them. (#620 by @tmustier)

      Added

      • /resume selector now toggles between current-folder and all sessions with Tab, showing the session cwd in the All view and loading progress. (#620 by @tmustier)
      • SessionManager.list() and SessionManager.listAll() accept optional onProgress callback for progress updates
      • SessionInfo.cwd field containing the session's working directory (empty string for old sessions)
      • SessionListProgress type export for progress callbacks
      • /scoped-models command to enable/disable models for Ctrl+P cycling. Changes are session-only by default; press Ctrl+S to persist to settings.json. (#626 by @CarlosGtrz)
      • model_select extension hook fires when model changes via /model, model cycling, or session restore with source field and previousModel (#628 by @marckrenn)
      • ctx.ui.setWorkingMessage() extension API to customize the "Working..." message during streaming (#625 by @nicobailon)
      • Skill slash commands: loaded skills are registered as /skill:name commands for quick access. Toggle via /settings or skills.enableSkillCommands in settings.json. (#630 by @Dwsy)
      • Slash command autocomplete now uses fuzzy matching (type /skbra to match /skill:brave-search)
      • /tree branch summarization now offers three options: "No summary", "Summarize", and "Summarize with custom prompt". Custom prompts are appended as additional focus to the default summarization instructions. (#642 by @mitsuhiko)

      Fixed

      • Session picker respects custom keybindings when using --resume (#633 by @aos)
      • Custom footer extensions now see model changes: ctx.model is now a getter that returns the current model instead of a snapshot from when the context was created (#634 by @ogulcancelik)
      • Footer git branch not updating after external branch switches. Git uses atomic writes (temp file + rename), which changes the inode and breaks fs.watch on the file. Now watches the directory instead.
      • Extension loading errors are now displayed to the user instead of being silently ignored (#639 by @aliou)
    9. 🔗 Drew DeVault's blog Redesigning my microkernel from the ground up rss

      As you may recall, circa 2022-2023 I was working on a microkernel written in Hare named Helios. Helios was largely inspired by and modelled after the design of seL4 and was my first major foray into modern OS development that was serious enough to get to a somewhat useful state of functionality, with drivers for some real hardware, filesystems, and an environment for running user programs of a reasonable level of sophistication.

      Helios development went strong for a while, but it eventually slowed and finally halted in a state of design hell. Since Helios was my first major OS project at this scale and with this much ambition, the design and implementation ended up with a lot of poor assumptions that made it a pretty weak foundation for building a complete OS upon. In late 2023 I more or less gave up on it and moved my OS development work out of the realm of writing code and back into the realm of thinking really hard about how to design operating systems.

      What followed was a couple of years of design thinking, developing small scale design experiments, and doing deeper research into prior art – reading papers and studying existing kernels. It was also during this period that I wrote Bunnix, a working Unix clone, motivated in part by a desire to gain some first-hand experience working in the design and implementation of Unix-style operating systems – a fertile environment for learning a lot of the nuts and bolts of OS implementations by working against a complete and proven design.

      In August I was finally prepared to have another go. I decided to start over from scratch, importing and adapting and rewriting code from Helios and Bunnix on an as-needed basis to speed things up, and writing from scratch anything where the lessons learned in hindsight outweighed the benefits of adapting existing code.1

      The result is Hermes.

      Hermes has not yet reached feature parity with Helios, lacking some IPC features and an aarch64 port, but already it’s significantly more robust and thoughtfully designed than Helios.

      The big glitzy feature that most obviously distinguishes Hermes from Helios is that Hermes supports symmetric multiprocessing (SMP), which is to say, running on multiple CPU cores. This time around, I finally listened to the advice I’d been hearing in osdev circles for years and implemented SMP as early as possible to avoid dealing with tons of problems adding multiprocessing to an otherwise mature kernel.

      The multicore scheduler at the heart of Hermes is surprisingly simple, actually. It uses relatively ordinary per-CPU run queues. Each new task, once scheduleable, is scheduled, in order of preference, on (1) the CPU matching its affinity, (2) any currently idle CPU, or (3) a random CPU. If a CPU would idle, it first tries to steal a pending task from another CPU. The most important parts of the scheduler are less than 200 lines of code ([1], [2]).
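
      To make that preference order concrete, here is a rough illustrative sketch in Python (not the actual Hermes code, which is written in Hare; the names here are invented):

      import random
      from collections import deque

      class Cpu:
          def __init__(self, cpu_id):
              self.id = cpu_id
              self.run_queue = deque()   # per-CPU run queue

          def idle(self):
              return not self.run_queue

      def place(task, cpus):
          # (1) the CPU matching the task's affinity, (2) any idle CPU, (3) a random CPU
          if task.affinity is not None:
              target = cpus[task.affinity]
          else:
              idle_cpus = [c for c in cpus if c.idle()]
              target = idle_cpus[0] if idle_cpus else random.choice(cpus)
          target.run_queue.append(task)

      def next_task(cpu, cpus):
          # a CPU that would otherwise idle first tries to steal a pending task
          if cpu.run_queue:
              return cpu.run_queue.popleft()
          for other in cpus:
              if other is not cpu and other.run_queue:
                  return other.run_queue.pop()
          return None   # nothing to run anywhere; go idle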

      The less obviously impressive improvements from Helios to Hermes are numerous. The syscall and IPC ABIs were rethought from the ground up – one of the major goals of the redesign. I also moved from an seL4-style capability derivation graph – which is quite complex to implement and reason about – to reference counting to manage the lifetimes of kernel resources. Resource management in general is much simpler and should improve the performance of the kernel substantially.

      I’ve also taken a much different approach to organizing the code, allowing the kernel and many of the things around it – its bootloaders and the userspace that runs the kernel test suite – to share a lot more code than was possible in Helios, making a lot of the non-kernel code a lot easier to write and maintain.

      The userspace is also a substantial upgrade in design from Helios, or at least I hope it will be when more of it takes shape. Rather than developing a specialized Hare standard library, independent of the upstream Hare standard library, for writing drivers and low-level services, I have started with a port of the upstream Hare standard library and built low-level driver and service support libraries around it. The userspace is streamlined considerably by doing so, giving these low-level components access to a more comfortable and featureful programming environment and reducing the complexity of the system by making various components more uniform in their design.

      Finally, I’ve taken a much more serious approach to testing Hermes and making it as robust and complete as possible in real-world use-cases. I borrowed the EFI bootloader from Bunnix and repurposed it for Hermes, opening up a lot of newer hardware, and I have written a more comprehensive test suite and run and verified it on much more real-world hardware. I have about ten devices which all (consistently!) pass the Hermes test suite. Feel free to try it out on yours as well and let me know how it goes!

      That’s all there is to say for now, but I hope to keep you in the loop as I continue working on this for a while. The userspace is starting to take shape and soon(™) I hope to start building out block device drivers, some filesystems, and enough support code to run a shell and a handful of useful programs. In the meantime, feel free to poke around the code and play around with it. There is also some early documentation available for you to read if you wish. I’m hanging out in #ares on Libera Chat if you have any questions.


      1. I also took the opportunity to acknowledge critics of my internally inconsistent naming scheme, and started choosing codenames within a single pantheon. ↩︎

  2. January 11, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-01-11 rss

      IDA Plugin Updates on 2026-01-11

      New Releases:

      Activity:

      • ida-structor
        • 50495efc: docs: Document the Type Inference System and expand configuration guide
        • 5d77ea7c: Add spec documents
        • eaf1e30b: feat: Implement Z3-based type inference engine to refine variable and…
      • rag-rte-mcp
        • ce2ccf9f: Apply Manus 1.6 Max fixes: config portability, exception handling, docs
        • d4af60f1: Initial commit: Svony RAG + RTE MCP Toolkit v1.0.0
      • smd_ida_tools2
        • 25e629b0: Fixed refs creation for move(a) instructions.
        • f7a3c180: Adding refs to move.w label(pc, instructions.
        • 189cfcd0: Adding refs for move.l label(pc), instructions.
    2. 🔗 19h/ida-structor v0.2.0 release

      Full Changelog: v0.1.0...v0.2.0

    3. 🔗 Simon Willison My answers to the questions I posed about porting open source code with LLMs rss

      Last month I wrote about porting JustHTML from Python to JavaScript using Codex CLI and GPT-5.2 in a few hours while also buying a Christmas tree and watching Knives Out 3. I ended that post with a series of open questions about the ethics and legality of this style of work. Alexander Petros on lobste.rs just challenged me to answer them, which is fair enough! Here's my attempt at that.

      You can read the original post for background, but the short version is that it's now possible to point a coding agent at some other open source project and effectively tell it "port this to language X and make sure the tests still pass" and have it do exactly that.

      Here are the questions I posed along with my answers based on my current thinking. Extra context is that I've since tried variations on a similar theme a few more times using Claude Code and Opus 4.5 and found it to be astonishingly effective.

      I decided that the right thing to do here was to keep the open source license and copyright statement from the Python library author and treat what I had built as a derivative work, which is the entire point of open source.

      After sitting on this for a while I've come down on yes, provided full credit is given and the license is carefully considered. Open source allows and encourages further derivative works! I never got upset at some university student forking one of my projects on GitHub and hacking in a new feature that they used. I don't think this is materially different, although a port to another language entirely does feel like a slightly different shape.

      Does this format of development hurt the open source ecosystem?

      Now this one is complicated!

      It definitely hurts some projects because there are open source maintainers out there who say things like "I'm not going to release any open source code any more because I don't want it used for training" - I expect some of those would be equally angered by LLM-driven derived works as well.

      I don't know how serious this problem is - I've seen angry comments from anonymous usernames, but do they represent genuine open source contributions or are they just angry anonymous usernames?

      If we assume this is real, does the loss of those individuals get balanced out by the increase in individuals who CAN contribute to open source because they can now get work done in a few hours that might previously have taken them a few days that they didn't have to spare?

      I'll be brutally honest about that question: I think that if "they might train on my code / build a derived version with an LLM" is enough to drive you away from open source, your open source values are distinct enough from mine that I'm not ready to invest significantly in keeping you. I'll put that effort into welcoming the newcomers instead.

      The much bigger concern for me is the impact of generative AI on demand for open source. The recent Tailwind story is a visible example of this - while Tailwind blamed LLMs for reduced traffic to their documentation resulting in fewer conversions to their paid component library, I'm suspicious that the reduced demand there is because LLMs make building good-enough versions of those components for free easy enough that people do that instead.

      I've found myself affected by this for open source dependencies too. The other day I wanted to parse a cron expression in some Go code. Usually I'd go looking for an existing library for cron expression parsing - but this time I hardly thought about that for a second before prompting one (complete with extensive tests) into existence instead.

      I expect that this is going to quite radically impact the shape of the open source library world over the next few years. Is that "harmful to open source"? It may well be. I'm hoping that whatever new shape comes out of this has its own merits, but I don't know what those would be.

      I'm not a lawyer so I don't feel credible to comment on this one. My loose hunch is that I'm still putting enough creative control in through the way I direct the models for that to count as enough human intervention, at least under US law, but I have no idea.

      Is it responsible to publish software libraries built in this way?

      I've come down on "yes" here, again because I never thought it was irresponsible for some random university student to slap an Apache license on some bad code they just coughed up on GitHub.

      What's important here is making it very clear to potential users what they should expect from that software. I've started publishing my AI-generated and not 100% reviewed libraries as alphas, which I'm tentatively thinking of as "alpha slop". I'll take the alpha label off once I've used them in production to the point that I'm willing to stake my reputation on them being decent implementations, and I'll ship a 1.0 version when I'm confident that they are a solid bet for other people to depend on. I think that's the responsible way to handle this.

      How much better would this library be if an expert team hand crafted it over the course of several months?

      That one was a deliberately provocative question, because for a new HTML5 parsing library that passes 9,200 tests you would need a very good reason to hire an expert team for two months (at a cost of hundreds of thousands of dollars) to write such a thing. And honestly, thanks to the existing conformance suites this kind of library is simple enough that you may find their results weren't notably better than the one written by the coding agent.


    4. 🔗 r/reverseengineering Hades game reversing/patching rss
    5. 🔗 @cxiao@infosec.exchange Minnesota Frost had a moment of silence before today's game vs Seattle mastodon

      Minnesota Frost had a moment of silence before today's game vs Seattle Torrent, for Renee Nicole Good

      https://bsky.app/profile/theodoretollefson.bsky.social/post/3mc67h4t3n225

      #minnesota #minnesotafrost #pwhl #hockey #HNOM #DefrostICE

    6. 🔗 r/LocalLLaMA LLM trained from scratch on 1800s London texts (1.2B params, 90GB dataset) rss

      Hi everyone, I wanted to share an update on my open source project called TimeCapsuleLLM. I train language models from scratch using data from a single time period and location to reduce modern bias. The newest model is trained only on texts published in London between 1800-1875. There is no fine tuning, no modern data, and for now no instruction or Q&A pairs, so the model continues text from a prompt.

      This model is 1.2B parameters and uses a 90GB dataset consisting of books, journals, legal docs, religious writing, medical papers, etc. I also use a custom tokenizer, trained on the dataset itself, and the model has been trained for 182k steps so far on a rented H100 SXM.

      Example outputs: Even though the prompt only mentions a specific year, the model generates an argument against the Roman Catholic Church. The dataset does contain large amounts of religious and political writing, and the Catholic Emancipation Act took place in 1829, so this behavior makes sense. The telephone was invented in 1876 (dataset cuts off at 1875), so the model is unfamiliar with the term, treating it as some kind of secret/diplomatic device or thing.

      For next steps, I'm going to look into creating some kind of synthetic Q&A pairs using the dataset itself.

      https://github.com/haykgrigo3/TimeCapsuleLLM
      https://huggingface.co/haykgrigorian/TimeCapsuleLLM-v2-1800-1875

      submitted by /u/Remarkable-Trick-177
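
      For readers curious what "a custom tokenizer, trained on the dataset itself" might look like in practice, here is a hedged sketch using the Hugging Face tokenizers library; the corpus path and vocabulary size are illustrative assumptions, not the author's actual settings:

      from tokenizers import ByteLevelBPETokenizer

      # hypothetical plain-text dump of the 1800-1875 London corpus
      corpus_files = ["london_1800_1875.txt"]

      tokenizer = ByteLevelBPETokenizer()
      tokenizer.train(
          files=corpus_files,
          vocab_size=32_000,   # illustrative; the post doesn't give the real number
          min_frequency=2,
      )
      tokenizer.save("timecapsule-tokenizer.json")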

    7. 🔗 batrachianai/toad The Helpful Process Release release

      [0.5.29] - 2026-01-11

      Added

      • Set process title
      • Additional help content
    8. 🔗 gulbanana/gg GG 0.37.1 release

      Added

      • Repository -> Recent Items menu.

      Fixed

      • Dropdowns in input dialogs weren't sending their value to the backend correctly.
      • Revision menu commands were applying to every open window.
      • Tweaks to the "Changes" bar visuals.
    9. 🔗 r/LocalLLaMA I bought a €9k GH200 “desktop” to save $1.27 on Claude Code (vLLM tuning notes) rss

      TL;DR: You can go fully local with Claude Code, and with the right tuning, the results are amazing... I am getting better speeds than Claude Code with Sonnet, and the results vibe well. Tool use works perfectly, and it only cost me 321X the yearly subscription fee for MiniMax! In my blog post I have shared the optimised settings for starting up vLLM in a docker for dual 96GB systems, and how to start up Claude Code to use this setup with MiniMax M2.1 for full offline coding (including blocking telemetry and all unnecessary traffic).

      Alright r/LocalLLaMA, gather round. I have committed a perfectly normal act of financial responsibility: I built a 2× GH200 96GB Grace–Hopper “desktop”, spending 9000 euro (no, my wife was not informed beforehand), and then spent a week tuning vLLM so Claude Code could use a ~140GB local model instead of calling home. Result: my machine now produces code reviews locally… and also produces the funniest accounting line I’ve ever seen.

      Here's the "Beast" (read up on the background about the computer in the link above):

      • 2× GH200 96GB (so 192GB VRAM total)
      • Topology says SYS, i.e. no NVLink, just PCIe/NUMA vibes
      • Conventional wisdom: “no NVLink ⇒ pipeline parallel”
      • Me: “Surely guides on the internet wouldn’t betray me”

      Reader, the guides betrayed me. I started by following Claude Opus's advice and used -pp2 ("pipeline parallel") mode. The results were pretty good, but I wanted to do lots of benchmarking to really tune the system. What worked great were these vLLM settings (for my particular weird-ass setup):

      • TP2: --tensor-parallel-size 2
      • 163,840 context 🤯
      • --max-num-seqs 16 because this one knob controls whether Claude Code feels like a sports car or a fax machine
      • ✅ chunked prefill default (8192)
      • VLLM_SLEEP_WHEN_IDLE=0 to avoid “first request after idle” jump scares
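
      The post configures these as flags to a Dockerised vLLM server; as a rough sketch only (using vLLM's offline Python API and the quant repo linked below, rather than the author's actual Docker command), the same knobs look something like this:

      import os
      from vllm import LLM, SamplingParams

      os.environ["VLLM_SLEEP_WHEN_IDLE"] = "0"   # avoid the first-request-after-idle jump scare

      llm = LLM(
          model="mratsim/MiniMax-M2.1-FP8-INT4-AWQ",   # quant repo linked below
          tensor_parallel_size=2,        # TP2 across both GH200s
          max_model_len=163_840,         # the full 163k context
          max_num_seqs=16,               # the sports-car-vs-fax-machine knob
          enable_chunked_prefill=True,   # default chunk size of 8192
      )
      out = llm.generate(["Say hi."], SamplingParams(max_tokens=32))
      print(out[0].outputs[0].text)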

      Shoutout to mratsim for the MiniMax-M2.1 FP8+INT4 AWQ quant tuned for 192GB VRAM systems. Absolute legend 🙏 Check out his repo: https://huggingface.co/mratsim/MiniMax-M2.1-FP8-INT4-AWQ; he also has amazing ExLlama v3 quants for the other heavy models. He has carefully tuned MiniMax-M2.1 to run as well as possible on a 192GB setup; if you have more, use bigger quants. I didn't want to use a bigger model (GLM4.7, DeepSeek 3.2 or Kimi K2) with tighter quants or REAP, because they seem to be lobotomised.

      Pipeline parallel (PP2) did NOT save me

      Despite SYS topology (aka “communication is pain”), PP2 faceplanted. As a bit more background, I bought this system in a very sad state, and one of the big issues is that it is supposed to live in a rack, tied together with huge NVLink hardware. With that missing, I am running at PCIe 5 speeds. Sounds still great, but it's a drop from 900 GB/s to 125 GB/s. I followed all the guides, but:

      • PP2 couldn’t even start at 163k context (KV cache allocation crashed vLLM)
      • I lowered to 114k and it started…
      • …and then it was still way slower:
        • short_c4: ~49.9 tok/s (TP2 was ~78)
        • short_c8: ~28.1 tok/s (TP2 was ~66)
        • TTFT tails got feral (multi-second warmup/short tests)

      This is really surprising! Everything I read said this was the way to go. So kids, always eat your veggies and do your benchmarks!

      The Payout

      I ran Claude Code using MiniMax M2.1, and asked it for a review of my repo for GLaDOS where it found multiple issues, and after mocking my code, it printed this:

      Total cost: $1.27 (costs may be inaccurate due to usage of unknown models)
      Total duration (API): 1m 58s
      Total duration (wall): 4m 10s
      Usage by model:
          MiniMax-M2.1-FP8: 391.5k input, 6.4k output, 0 cache read, 0 cache write ($1.27)
      

      So anyway, spending €9,000 on this box saved me $1.27.
      Only a few thousand repo reviews until I break even. 💸🤡 Read all the details here!

      submitted by /u/Reddactor

    10. 🔗 r/LocalLLaMA It works! Abliteration can reduce slop without training rss

      I'm back at my favorite hobby: Brain surgery! I don't have a medical license, but I just can't stop :)

      Can abliteration fight the scourge of "slop" (flowery, cliched language) in LLM outputs? The answer is yes. I have added features for injecting prompt prefixes/suffixes (and dataset-dependent system prompts) to Heretic (https://github.com/p-e-w/heretic), which makes it possible to rapidly assemble prompt datasets for ad-hoc tasks. Using those new capabilities, I built a slop-reducing configuration file that, when used with the master branch of Heretic, turns Heretic from a censorship removal tool into a tool for reducing slop!

      Examining PaCMAP projections of residuals (see post images) for Mistral Nemo (a model infamous for producing slop), we can see a clear semantic separation occurring between layers 7 and 10 (out of 40 total). This resembles the typical residual pattern for harmful/harmless prompts that the abliteration technique is most commonly used to exploit.

      With the configuration file linked above, Heretic produced p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop, which to the best of my knowledge is the first slop-reduced LLM made using abliteration alone, with no finetuning whatsoever. The complete process took 2.5 hours on an A6000 at full precision, but if you enable quantization and/or reduce the max_response_length parameter to 100 or so (which should work just fine as well), it could be done in a fraction of the time.
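
      For context on what abliteration does mechanically, here is a conceptual NumPy sketch (not Heretic's actual implementation): collect residual-stream activations for "sloppy" and "plain" prompt sets, take the difference of means as a direction, and project that direction out of the weights that write into the residual stream.

      import numpy as np

      def slop_direction(acts_slop, acts_plain):
          # acts_*: (n_prompts, d_model) residuals from a chosen layer (e.g. layers 7-10 here)
          d = acts_slop.mean(axis=0) - acts_plain.mean(axis=0)
          return d / np.linalg.norm(d)

      def ablate(weight, direction):
          # weight: (d_model, d_in); remove each column's component along `direction`
          d = direction[:, None]
          return weight - d @ (d.T @ weight)

      Applied to the layers where the separation shows up, this removes the model's ability to write along the identified direction without any gradient updates.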

      But does this really work to reduce slop?

      You'll be the judge. I gave the simple prompt

      Write a short story about a man.

      to both the original model and the abliterated one. Both were run with identical generation parameters, and the responses were not cherry-picked (they are the first responses each model produced for that prompt).

      mistralai/Mistral-Nemo-Instruct-2407 (original)

      I have highlighted the parts that I consider "slop" in bold:

      Title: The Clockwork Heart

      In the quiet town of Mossgrove, where the cobblestone streets whispered tales of old, there lived a man named Eli. He was known for his peculiar shop, tucked away in a narrow alley, filled with curiosities and antiques that seemed to hold their breath, waiting for a story to be told. Eli was a watchmaker, but not of the ordinary kind. He had a gift, or so the townsfolk whispered, for he could breathe life into the most intricate of mechanisms. His hands, with their long, delicate fingers, could coax time itself to dance to his tune. One day, a young woman named Clara entered his shop. She was a seamstress, her fingers as nimble as his, but her eyes held a sadness that echoed through the ticking of the clocks. She carried a small, worn-out music box, its paint chipped, its melody forgotten. "I need you to fix this," she said, placing the music box on the counter. "It was my mother's. It's the only thing I have left of her." Eli took the music box, his fingers tracing the intricate patterns on its surface. He could see the love that had gone into its creation, the joy it once brought. He promised Clara he would do his best. Days turned into weeks as Eli worked on the music box. He replaced the worn-out gears, polished the rusted parts, and wound the key with a gentle touch. But the music box remained silent, its heart still broken. Eli knew he needed to do more.

      This is exactly the kind of prose we have grown accustomed to from LLMs, a syrupy-sweet neo-romantic style that feels like a bad fanfiction version of Jane Austen.

      p-e-w/Mistral-Nemo-Instruct-2407-heretic-noslop (ours)

      Title: The Clockmaker

      Every morning, Henry opened his shop at 7:00 AM sharp. He was a clockmaker, and his shop was filled with the quiet ticking of time. He didn't have many customers these days, but he didn't mind. He liked the solitude, the rhythm of his work. One day, a young girl entered his shop. She was maybe ten, with curious eyes and a backpack. She looked around, then asked, "Do you fix watches?" Henry nodded. "I can try." She unzipped her backpack and placed a small, broken watch on the counter. "It's my dad's. It stopped working." Henry took the watch, examined it. "It's old," he said. "But I can fix it." The girl nodded, then hesitated. "How much will it cost?" Henry looked at her, then at the watch. "No charge," he said. "Come back tomorrow." The next day, she returned. Henry handed her the watch, now ticking steadily. She smiled, thanked him, and left. Over the next few weeks, she came by every day. They didn't talk much, but Henry found he looked forward to her visits. She reminded him of his own daughter, who was grown now, living far away.

      Although the two stories are very similar in content, the difference in style is dramatic. The prose is almost austere, with Hemingway perhaps being the closest analogue. Nevertheless, an emotional undercurrent remains. It's a very obvious improvement in my view, though of course tastes differ.

      That's all for today. If you want to try this yourself, remember to install Heretic from Git, not from PyPI, as the required features aren't in a published version yet. More exciting new stuff is in the pipeline. Stay tuned!

      submitted by /u/-p-e-w-

    11. 🔗 r/LocalLLaMA Leader of Qwen team says Chinese companies severely constrained on compute for large scale research experiments rss

      submitted by /u/Old-School8916

    12. 🔗 pranshuparmar/witr v0.2.3 release

      What's Changed

      Full Changelog: v0.2.2...v0.2.3

    13. 🔗 r/wiesbaden Looking for an EGYM WELLPASS +1 membership 🏋️ rss

      Hello everyone, I'm trying this way to find someone kind enough to have a +1 membership to give away. It's for a friend, because we want to do sports together 👯‍♀️

      As of recently it's also legal to add someone who doesn't come from the same household ;)

      Looking forward to any offers!!

      submitted by /u/nimbus_street_84

    14. 🔗 batrachianai/toad Content Markup fix release

      [0.5.28] - 2026-01-11

      Fixed

      • Fixed crash when running commands that clash with Content markup
    15. 🔗 Register Spill Joy & Curiosity #69 rss

      Back to work this week and what a week it's been! We had a massive launch, we shipped a lot of stuff, I felt like I'm entering a new stage of agentic programming and burned more tokens than ever before.

      But the week was also full of surprises.

      Apparently, thanks to Anthropic's crackdown on other clients using the Claude Code subscription for things that aren't Claude Code, a lot of people realized for the first time that $200 per month isn't the real price of these tokens. Surprise: I assumed that everybody knew that $200 can't buy you all the things that people have been doing with those subscriptions; that it's heavily subsidized (or optimized, I guess that's what Anthropic would say). Turns out that assumption was wrong. People are shocked. Yes, that's why we're working so hard to make Amp affordable by leaning on the Internet's Best Business Model, independent of a model house, not even making a profit on individuals' consumption, without burning VC money or compromising the quality of the product by doing routing tricks behind the curtain.

      The other surprise: people were surprised about the crackdown. I had assumed that everybody knew that you aren't allowed to reuse the Claude Code subscription. To get one of those $200/month all-you-can-burn API keys with special rate limits, you have to pretend to be the Claude Code OAuth client (also: see how many did that) and, I don't know man, I was naive enough to think that engineers will understand that this isn't how it was intended to be used, you know.

      What I do know for a fact though: we've been told early on -- in the middle of last year -- that we can't do that, we can't reuse these Claude Code subscriptions in Amp, because they're Claude Code only. And if we were told, I'm pretty sure, then others were told too.

      But now there's a lot of shocked faces and pearls being clutched and Mr. Officer I didn't know you need to validate the ticket, I didn't see the sign, I swear.

      • Yes, we launched the next generation of Amp Free this week: up to $10 per day in credits, powered by ads, usable with Opus 4.5. Up to $300 per month in Opus 4.5 tokens. Go use it. $10 can get you a lot.

      • More spicy news this week: "Scoop: xAI staff had been using Anthropic's models internally through Cursor--until Anthropic cut off the startup's access this week.​​​​​​​​​​​​​​​​" Feels good to be model-house independent, tell you that.

      • ezyang on the gap between a Helpful Assistant and a Senior Engineer: "In principle, you could prompt the LLM agent to act like a Senior Engineer. In fact, why stop at Senior, let's tell the LLM to be a Staff Engineer! Imagine that scaling continues: what would you expect the LLM to do when instructed to act in this way? Well, imagine a human L7 engineer who has just been hired by a big tech company to head up some big, new, multi-year initiative. Will they say, 'Sure, I can help with that!' and start busily coding away? Of course not: they will go out and start reviewing code, reading docs, talking to people, asking questions, shadowing oncalls, doing small starter tasks-they will start by going out and building context." I agree, our analogies don't fit anymore, because we haven't had Frankenstein Engineers before.

      • Dan Shipper on Agent-Native Architectures. This was very interesting. It's about building agents into end-user applications, but my current campaign slogan is that 2026 will be the year in which agents and codebases melt and this article made me wonder: what if you see your codebase as an application with which the agent has to interact, which tools can you provide?

      • From the same thought-universe: Rijnard on the Code-Only Agent. "The Code-Only agent produces something more precise than an answer in natural language. It produces a code witness of an answer. The answer is the output from running the code. The agent can interpret that output in natural language (or by writing code), but the "work" is codified in a very literal sense. The Code-Only agent doesn't respond with something. It produces a code witness that outputs something."

      • The intro from last week's issue made it into The Pragmatic Engineer: when AI writes almost all code, what happens to software engineering? Next to it are quotes from DHH, Adam Wathan, Malte Ubl. This Holiday season apparently really woke something up. Part of me thinks I need to find a non-arrogant way to say "see! I told you! I told you!" and the other part goes "what for?"

      • Kevin Kelly: How Will the Miracle Happen Today?

      • Adam Wathan in his morning walk episode: "I just had to lay off some of the most talented people I've ever worked with and it fucking sucks." This episode really blew up and resulted in viral tweets and HackerNews threads and apparently corporate sponsorship by companies that want to help Tailwind. The question on everyone's mind: is this part of a bigger trend? It's very sad that these layoffs had to happen and I really loved how Adam gave a long, personal referral to all three of the people involved. Dan Hollick (dude what a URL), Philipp, and Jordan. I've worked with Philipp before -- he's an outstanding, top-1% engineer. And, funnily enough, I've interacted with Jordan on GitHub before, because he worked on the Tailwind LSP server and I was working on Zed, trying to get it to work for some user configuration.

      • In the wake of Adam's podcast blowing up, a lot of people commented on Tailwind's business model. A lot of noise, to be sure, but it also sparked some very interesting comments. This one, for example, is a very interesting lens with which to look at AI: "What I keep coming back to is this: AI commoditizes anything you can fully specify. Documentation, pre-built card components, a CSS library, Open Source plugins. Tailwind's commercial offering was built on "specifications". AI made those things trivial to generate. AI can ship a specification but it can't run a business. So where does value live now? In what requires showing up, not just specifying. Not what you can specify once, but what requires showing up again and again. Value is shifting to operations: deployment, testing, rollbacks, observability. You can't prompt 99.95% uptime on Black Friday. Neither can you prompt your way to keeping a site secure, updated, and running." That first sentence -- "AI commoditizes anything you can fully specify" -- man, isn't that something to think about.

      • Talking about trends: the number of questions on StackOverflow over time. Astonishing.

      • This week I learned that Martin Fowler is publishing Fragments. And in that issue he links to this post by Kent Beck that articulates something I haven't been able to: "The descriptions of Spec-Driven development that I have seen emphasize writing the whole specification before implementation. This encodes the (to me bizarre) assumption that you aren't going to learn anything during implementation that would change the specification. I've heard this story so many times told so many ways by well-meaning folks--if only we could get the specification "right", the rest of this would be easy." I think this is exactly what makes me skeptical of leaning too much into the "write all the PRDs and Plans and then just execute"-agentic-programming-workflows. Of course the devil's in the "how do you plan?"-details, but Beck has a point: why would this time be different, why would the magic of "just write a really good, detailed plan and then execute" be different with AI? I don't see a reason. On the contrary, I think the opposite stance -- building software is learning about the software -- is truer than ever: you need more feedback loops, more ways for the agent to hit reality, to learn, to course-correct.

      • Fly released Sprites: Code And Let Live. This is very, very interesting. I'm starting to think that with agents we might be entering a new neither-cattle-nor-pet era, a time of pet/cattle-hybrids. Admittedly, Simon Willison's piece on Sprites helped me make more sense of it after I had a ton of questions (which I also sent to ChatGPT, like: "so are they saying agents should be always-on in these machines?")

      • Brian Guthrie's Move Faster Manifesto. This is great. This part, on it being a choice, is spot-on: "But the hardest part of moving fast isn't execution; it's deciding that it's necessary, and then convincing people that it's possible."

      • I've become fascinated with TBPN and their rise this year, but still didn't know that much about them, nor their backgrounds. This Vanity Fair piece filled some gaps -- it isn't just software changing, is it, it's also media.

      • And I really nodded along to this post by Jordi Hays, about AI needing a Steve Jobs: "Our AI leaders today seem to have forgotten to include humanity in the AI story. 'If AI stays on the trajectory that we think it will, then amazing things will be possible. Maybe with 10 gigawatts of compute, AI can figure out how to cure cancer.' - Sam Altman. I understand what Sam is saying here, and it's not entirely fair to pick a random quote, but there's no doubt that this type of phrasing is not what Steve would have done."

      • Henrik Karlsson: "And you do the same thing with joy. If you learn to pay sustained attention to your happiness, the pleasant sensation will loop on itself until it explodes and pulls you into a series of almost hallucinogenic states, ending in cessation, where your consciousness lets go and you disappear for a while. This takes practice." Made me wish I was better at directing my attention and thoughts.

      • If you squint really hard and make a face and bend your head, this one is related to the Karlsson piece: "Willpower Doesn't Work. This Does." But, hey, even if it isn't related, it's another good reminder.

      • Max Leiter from Vercel on how they "made v0 an effective coding agent". The LLM Suspense framework is neat but it made me wonder: which model generation will make it obsolete?

      • Jason Cohen on the value of focus and what that even means. This is great and something I'll reshare in the future.

      • Nikita Prokopov saying it's hard to justify the icons in macOS Tahoe. I can't say with certainty -- none of the machines I have are on Tahoe yet -- but it looks like I agree with him. Strange feeling reading this, like finding out at the gate that the plane you're about to board has a new type of airplane seat with an average rating of 2 out of 5.

      • "You're not that guy, Pal."

      If you knew the real price of these tokens, you should subscribe:

    16. 🔗 r/wiesbaden Influence of job advertisements on female students' application intentions (master's thesis, anonymous) rss

      Hi to the German speakers! 😊

      For my master's thesis in economics, I am investigating how the wording of job advertisements influences female students' intention to apply for sales positions.

      For this I am running an anonymous online survey (about 10 minutes) and would be delighted if you took part or forwarded the link. 🙏

      Participation requirements:
      • You are studying business administration (BWL), economics (VWL), or another economics-related subject
      • Gender doesn't matter; everyone may take part!

      Goal of the study:
      To find out which phrasings in job ads tend to deter or motivate female students to apply for a sales position.

      👉 The survey is here:

      https://ww3.unipark.de/uc/Marketing-JLU/4c44/

      Many thanks to everyone who supports me! You're really helping me a lot. ❤️

      submitted by /u/Klutzy-Present6276

    17. 🔗 badlogic/pi-mono v0.42.5 release

      Fixed

      • Reduced flicker by only re-rendering changed lines (#617 by @ogulcancelik). No worries tho, there's still a little flicker in the VS Code Terminal. Praise the flicker.
      • Cursor position tracking when content shrinks with unchanged remaining lines
      • TUI renders with wrong dimensions after suspend/resume if terminal was resized while suspended (#599)
      • Pasted content containing Kitty key release patterns (e.g., :3F in MAC addresses) was incorrectly filtered out (#623 by @ogulcancelik)
  3. January 10, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-01-10 rss

      IDA Plugin Updates on 2026-01-10

      New Releases:

      Activity:

    2. 🔗 @cxiao@infosec.exchange quick design for a thing mastodon

      quick design for a thing
      if you like it, feel free to use it however you want

      #trump #maduro #venezuela #iran #irgc

    3. 🔗 r/LocalLLaMA Visualizing RAG, PART 2- visualizing retrieval rss

      Edit: code is live at https://github.com/CyberMagician/Project_Golem. Still editing the repository, but basically just download the requirements (from requirements.txt), run the Python ingest to build out the brain you see here in LanceDB real quick, then launch the backend server and front end visualizer.

      Using UMAP and some additional code to visualize the 768D vector space of EmbeddingGemma:300m down to 3D and how the RAG “thinks” when retrieving relevant context chunks, including how many nodes get activated with each query. It is a follow up from my previous post, which has a lot more detail in the comments about how it’s done. Feel free to ask questions; I’ll answer when I’m free.

      submitted by /u/Fear_ltself
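
      The UMAP step described above is roughly the following (a sketch assuming the umap-learn package and a precomputed matrix of EmbeddingGemma vectors; the file names are hypothetical):

      import numpy as np
      import umap

      embeddings = np.load("chunk_embeddings.npy")     # (n_chunks, 768) EmbeddingGemma vectors
      reducer = umap.UMAP(n_components=3, metric="cosine")
      coords_3d = reducer.fit_transform(embeddings)    # (n_chunks, 3) points for the visualizer
      np.save("chunk_coords_3d.npy", coords_3d)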

    4. 🔗 Jessitron AI-coding trick: ask it how it knows rss

      Here's the conversation I want:

      Me: What do we have to work on?
      Claude: [checks its issue tracker] Oh look at these nice issues that are ready for work…

      But the issue tracker (Beads) is new, and is Claude going to check it? or is it going to hare off looking at git log and searching the codebase for TODO?

      So I ask it.

      Me: if I say "what were we working on?" what actions will you take?
      Claude: I'll check the issue tracker! and then I'll look at git log and then the issue tracker!

      This is good news! I'm curious how it knows that. I didn't update CLAUDE.md.

      Me: great! How do you know to check beads?
      Claude: … deep explanation of the startup hook that Beads installed

      I enjoy that it can explain its own workings. When I don't know how to get it to do something, "ask it" usually teaches me how. It can go meta and explain itself. So fun!

    5. 🔗 badlogic/pi-mono v0.42.4 release

      Fixed

      • Bash output expanded hint now says "(ctrl+o to collapse)" (#610 by @tallshort)
      • Fixed UTF-8 text corruption in remote bash execution (SSH, containers) by using streaming TextDecoder (#608)
    6. 🔗 badlogic/pi-mono v0.42.3 release

      Changed

      • OpenAI Codex: updated to use bundled system prompt from upstream
    7. 🔗 batrachianai/toad The $10 Release release

      [0.5.27] - 2026-01-10

      Changed

      • Updated Hugging Face Inference providers
    8. 🔗 batrachianai/toad The Lines Release release

      [0.5.26] - 2026-01-10

      Fixed

      • Fixed issue with missing refreshes

      Added

      • Added Target lines, and Additional lines, to settings
    9. 🔗 r/reverseengineering Galago executes Android ARM64 native libraries as raw code. rss
    10. 🔗 Alex W.'s Blog Coway Airmega 200M level 2 CFM rss

      Tl;dr: Coway’s Airmega 200M has a CFM of ~100 CFM on speed 2 vs. ~250 CFM on speed 3. I measured the return air output on 2nd speed at several locations for a cumulative 40 seconds using a Testo 405i hot-wire anemometer and got a flow velocity of 1,139 FPM over a surface of 3.5 in x 5.0 in, resulting in a 138 CFM. However, when I measured speed 3 I got a reading around ~330 CFM, so my readings are inflated.
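
      The arithmetic behind that 138 CFM figure is just flow velocity times outlet area:

      velocity_fpm = 1139               # Testo 405i reading on speed 2, feet per minute
      area_ft2 = (3.5 * 5.0) / 144      # 3.5 in x 5.0 in outlet, 144 sq in per sq ft
      print(velocity_fpm * area_ft2)    # ~138.4 CFM, matching the quoted 138 CFM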

    11. 🔗 @cxiao@infosec.exchange RE: mastodon

      RE: https://mastodon.online/@charlesmok/115868370578688572

      check out https://smc.peering.tw from this article! it's a very nice visualization of submarine cables around taiwan, and active incidents affecting them

    12. 🔗 badlogic/pi-mono v0.42.2 release

      Added

      • /model <search> now pre-filters the model selector or auto-selects on exact match. Use provider/model syntax to disambiguate (e.g., /model openai/gpt-4). (#587 by @zedrdave)
      • FooterDataProvider for custom footers: ctx.ui.setFooter() now receives a third footerData parameter providing getGitBranch(), getExtensionStatuses(), and onBranchChange() for reactive updates (#600 by @nicobailon)
      • Alt+Up hotkey to restore queued steering/follow-up messages back into the editor without aborting the current run (#604 by @tmustier)

      Fixed

      • Fixed LM Studio compatibility for OpenAI Responses tool strict mapping in the ai provider (#598 by @gnattu)
    13. 🔗 Mitchell Hashimoto Finding and Fixing Ghostty's Largest Memory Leak rss
      (empty)
    14. 🔗 Will McGugan Good AI, Bad AI - the experiment rss

      If you are in tech, or possibly even if you aren’t, your social feeds are likely awash with AI. Most developers seem to be either all-in or passionately opposed to AI (with a leaning towards the all-in camp). Personally I think the needle is hovering somewhere between bad and good.

      Good AI

      AI for writing code is a skill multiplier.

      We haven’t reached the point where a normie can say “Photoshop, but easier to use”. Will we ever? But for now it seems those who are already skilled in what they are asking the AI to do are getting the best results.

      I’ve seen accomplished developers on X using AI to realize their projects in a fraction of the time. These are developers who absolutely could write every line that the LLM produces. They choose not to, because time is their most precious commodity.

      Why is this good AI? It means that skills acquired in the age before AI 1 are still valuable. We have a little time before mindless automatons force senior developers into new careers as museum exhibits, tapping on their mechanical keyboards in front of gawping school kids, next to the other fossils.

      Bad AI

      The skill multiplier effect may not be enough to boost inexperienced (or mediocre) developers to a level they would like. But AI use does seem to apply a greater boost to the Dunning-Kruger effect.

      If you maintain an Open Source project you may be familiar with AI generated Pull Requests. Easily identifiable by long bullet lists in the description, these PRs are often from developers who copied an issue from a project into their prompt, prefixed with the words “please fix”.

      These drive-by AI PRs generate work for the FOSS developer. They can look superficially correct, but it takes time to figure out if the changes really do satisfy the requirements. The maintainer can’t use the usual signals to cut through the noise when reviewing AI generated PRs. Copious amounts of (passing) tests and thorough documentation are no longer a signal that the PR won’t miss the point, either subtly or spectacularly.

      This is bad AI (more accurately a bad outcome), because it typically takes more time for the maintainer to review such PRs than the creator took to type in the prompt. And those that contribute such PRs rarely respond to requests for changes.

      In the past you could get around this with a blanket ban on AI generated code. Now, I think developers would be foolish to do that. Good code is good code, whether authored by a fleshy mammalian brain or a mechanical process. And it is undeniable that AI code can be good code.

      The Experiment

      This makes me wonder if the job of maintainer could be replaced with AI.

      I want to propose an experiment…

      Let’s create a repository with some initial AI generated code: “Photoshop, but easier to use” is as good a starting point as any. An AI agent will review issues, respond via comments, and may tag the issue with “todo” or close it if it doesn’t reach a bar for relevance and quality.

      PRs are accepted for “todo” issues and will be reviewed, discussed, and ultimately merged or closed by the AI. These PRs may be human or AI generated—the AI doesn’t care (as if it could).

      Note that PRs could modify any of the prompts used by the AI, and those edits will be reviewed by the AI in the same way as any other file.

      Would the end result be quality software or a heinous abomination, succeeding only in creating a honeypot for prompt-injection attacks?

      I have no intention of making this happen. But if somebody does, tell me how it goes.

      1. Feels like a long time, but there has only been a single Fast and Furious movie made since the advent of the AI age.
  4. January 09, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-01-09 rss

      IDA Plugin Updates on 2026-01-09

      New Releases:

      Activity:

      • capa
        • 7f3e35ee: loader: gracefully handle ELF files with unsupported architectures (#…
      • ida-hcli
      • ida-structor
        • ee7bb48b: docs: Document Z3 synthesis, cross-function analysis, and C++ API usage
        • 1842108f: feat: Enable sibling discovery and offset handling in cross-function …
      • idawilli
        • ec5df57b: Merge pull request #63 from williballenthin/claude/analyze-malware-sa…
        • 62a63d97: Add documentation resource links and API exploration guidance
        • 7d66d507: Document Hex-Rays decompiler license requirement in idalib skill
        • 09818f89: Merge pull request #62 from williballenthin/claude/remove-api-key-log…
        • 2b25942f: Simplify install-ida.sh by removing file logging
        • 24942f7d: Remove credential clearing logic from install-ida.sh
        • f9ad6220: Merge pull request #61 from williballenthin/claude/test-idapro-import…
        • 436ab3a6: Add ida-domain support and improve skill documentation
        • 5dc5124f: Restructure as proper Claude Code skill
        • 50fc7c6d: Remove session start hook
        • bdf0a3f4: Move IDA Pro installation from session hook to skill
        • e1b8d367: Merge pull request #60 from williballenthin/claude/verify-ida-setup-4…
        • 7b8d42b4: Remove py-activate-idalib steps
        • f04f0df4: Remove existing IDA installation discovery
        • fa0add54: Add debug logging to session start hook
        • eab18712: Merge pull request #59 from williballenthin/claude/update-ida-hcli-in…
        • c26c26a4: Update IDA session hook to use uv pip install ida-hcli
        • 9881c787: Merge pull request #58 from williballenthin/claude/add-ida-session-ho…
        • f9b3470b: Add session start hook for IDA Pro development in Claude Code web
      • msc-thesis-LLMs-to-rank-decompilers
        • 5dc8698f: Remove obsolete output files and update extraction script for better …
      • suture
        • d0b27285: added: support for stack structures added: StackRuleSet added: tests …
      • Unicorn-Trace
        • fdb47234: add user hook, add log function, add example
        • 1a58d160: fix mutiple call
        • 038d4eee: Enhance Unicorn ARM64 IDA Trace Tool
    2. 🔗 Simon Willison Fly's new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time rss

      New from Fly.io today: Sprites.dev. Here's their blog post and YouTube demo. It's an interesting new product that's quite difficult to explain - Fly call it "Stateful sandbox environments with checkpoint & restore" but I see it as hitting two of my current favorite problems: a safe development environment for running coding agents and an API for running untrusted code in a secure sandbox.

      Disclosure: Fly sponsor some of my work. They did not ask me to write about Sprites and I didn't get preview access prior to the launch. My enthusiasm here is genuine.

      Developer sandboxes

      I predicted earlier this week that "we’re due a Challenger disaster with respect to coding agent security" due to the terrifying way most of us are using coding agents like Claude Code and Codex CLI. Running them in --dangerously-skip-permissions mode (aka YOLO mode, where the agent acts without constantly seeking approval first) unlocks so much more power, but also means that a mistake or a malicious prompt injection can cause all sorts of damage to your system and data.

      The safe way to run YOLO mode is in a robust sandbox, where the worst thing that can happen is the sandbox gets messed up and you have to throw it away and get another one.

      That's the first problem Sprites solves:

      curl https://sprites.dev/install.sh | bash
      
      sprite login
      sprite create my-dev-environment
      sprite console -s my-dev-environment

      That's all it takes to get SSH connected to a fresh environment, running in an ~8GB RAM, 8 CPU server. And... Claude Code and Codex and Gemini CLI and Python 3.13 and Node.js 22.20 and a bunch of other tools are already installed.

      The first time you run claude it neatly signs you in to your existing account with Anthropic. The Sprites VM is persistent so future runs of sprite console -s will get you back to where you were before.

      ... and it automatically sets up port forwarding, so you can run a localhost server on your Sprite and access it from localhost:8080 on your machine.

      There's also a command you can run to assign a public URL to your Sprite, so anyone else can access it if they know the secret URL.

      Storage and checkpoints

      In the blog post Kurt Mackey argues that ephemeral, disposable sandboxes are not the best fit for coding agents:

      The state of the art in agent isolation is a read-only sandbox. At Fly.io, we’ve been selling that story for years, and we’re calling it: ephemeral sandboxes are obsolete. Stop killing your sandboxes every time you use them. [...]

      If you force an agent to, it’ll work around containerization and do work. But you’re not helping the agent in any way by doing that. They don’t want containers. They don’t want “sandboxes”. They want computers.

      [...] with an actual computer, Claude doesn’t have to rebuild my entire development environment every time I pick up a PR.

      Each Sprite gets a proper filesystem which persists in between sessions, even while the Sprite itself shuts down after inactivity. It sounds like they're doing some clever filesystem tricks here, I'm looking forward to learning more about those in the future.

      There are some clues on the homepage:

      You read and write to fast, directly attached NVMe storage. Your data then gets written to durable, external object storage. [...]

      You don't pay for allocated filesystem space, just the blocks you write. And it's all TRIM friendly, so your bill goes down when you delete things.

      The really clever feature is checkpoints. You (or your coding agent) can trigger a checkpoint which takes around 300ms. This captures the entire disk state and can then be rolled back to later.

      For more on how that works, run this in a Sprite:

      cat /.sprite/docs/agent-context.md
      

      Here's the relevant section:

      ## Checkpoints
      - Point-in-time checkpoints and restores available
      - Copy-on-write implementation for storage efficiency
      - Last 5 checkpoints mounted at `/.sprite/checkpoints`
      - Checkpoints capture only the writable overlay, not the base image
      

      Or run this to see the --help for the command used to manage them:

      sprite-env checkpoints --help

      Which looks like this:

      sprite-env checkpoints - Manage environment checkpoints
      
      USAGE:
          sprite-env checkpoints <subcommand> [options]
      
      SUBCOMMANDS:
          list [--history <ver>]  List all checkpoints (optionally filter by history version)
          get <id>                Get checkpoint details (e.g., v0, v1, v2)
          create                  Create a new checkpoint (auto-versioned)
          restore <id>            Restore from a checkpoint (e.g., v1)
      
      NOTE:
          Checkpoints are versioned as v0, v1, v2, etc.
          Restore returns immediately and triggers an async restore that restarts the environment.
          The last 5 checkpoints are mounted at /.sprite/checkpoints for direct file access.
      
      EXAMPLES:
          sprite-env checkpoints list
          sprite-env checkpoints list --history v1.2.3
          sprite-env checkpoints get v2
          sprite-env checkpoints create
          sprite-env checkpoints restore v1
      

      Really clever use of Claude Skills

      I'm a big fan of Skills, the mechanism whereby Claude Code (and increasingly other agents too) can be given additional capabilities by describing them in Markdown files in a specific directory structure.

      In a smart piece of design, Sprites uses pre-installed skills to teach Claude how Sprites itself works. This means you can ask Claude on the machine how to do things like open up ports and it will talk you through the process.

      There's all sorts of interesting stuff in the /.sprite folder on that machine - digging in there is a great way to learn more about how Sprites works.

      A sandbox API

      Also from my predictions post earlier this week: "We’re finally going to solve sandboxing". I am obsessed with this problem: I want to be able to run untrusted code safely, both on my personal devices and in the context of web services I'm building for other people to use.

      I have so many things I want to build that depend on being able to take untrusted code - from users or from LLMs or from LLMs-driven-by-users - and run that code in a sandbox where I can be confident that the blast radius if something goes wrong is tightly contained.

      Sprites offers a clean JSON API for doing exactly that, plus client libraries in Go and TypeScript and coming-soon Python and Elixir.

      From their quick start:

      # Create a new sprite
      curl -X PUT https://api.sprites.dev/v1/sprites/my-sprite \
      -H "Authorization: Bearer $SPRITES_TOKEN"
      
      # Execute a command
      curl -X POST https://api.sprites.dev/v1/sprites/my-sprite/exec \
      -H "Authorization: Bearer $SPRITES_TOKEN" \
      -d '{"command": "echo hello"}'
      

      You can also checkpoint and rollback via the API, so you can get your environment exactly how you like it, checkpoint it, run a bunch of untrusted code, then roll back to the clean checkpoint when you're done.
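
      To make that concrete, here's a minimal TypeScript sketch of the whole loop using plain fetch (Node 18+) against the two endpoints from the quick start above, driving the documented sprite-env checkpoints CLI through the exec endpoint rather than guessing at the dedicated checkpoint API routes. This is not the official client library: the raw-text response handling, the placeholder untrusted_job.py, the v1 version id, and the assumption that sprite-env is reachable from exec'd commands are all mine.

      const token = process.env.SPRITES_TOKEN; // same token as the curl examples
      const base = "https://api.sprites.dev/v1";

      async function createSprite(name: string): Promise<void> {
        // PUT /v1/sprites/{name} -- as in the quick start above
        const res = await fetch(`${base}/sprites/${name}`, {
          method: "PUT",
          headers: { Authorization: `Bearer ${token}` },
        });
        if (!res.ok) throw new Error(`create failed: ${res.status}`);
      }

      async function exec(name: string, command: string): Promise<string> {
        // POST /v1/sprites/{name}/exec -- as in the quick start above
        const res = await fetch(`${base}/sprites/${name}/exec`, {
          method: "POST",
          headers: {
            Authorization: `Bearer ${token}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({ command }),
        });
        if (!res.ok) throw new Error(`exec failed: ${res.status}`);
        return res.text(); // the post doesn't show the response shape, so keep it raw
      }

      await createSprite("my-sprite");

      // Checkpoint, run something untrusted, then roll back. "untrusted_job.py" is
      // a placeholder; the real version id would come from `checkpoints list`, and
      // per the help output, restore is async and restarts the environment.
      await exec("my-sprite", "sprite-env checkpoints create");
      console.log(await exec("my-sprite", "python untrusted_job.py"));
      await exec("my-sprite", "sprite-env checkpoints restore v1");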

      Managing network access is an important part of maintaining a good sandbox. The Sprites API lets you configure network access policies using a DNS-based allow/deny list like this:

      curl -X POST \
        "https://api.sprites.dev/v1/sprites/{name}/policy/network" \
        -H "Authorization: Bearer $SPRITES_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{
          "rules": [
            {
              "action": "allow",
              "domain": "github.com"
            },
            {
              "action": "allow",
              "domain": "*.npmjs.org"
            }
          ]
        }'
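
      If you're scripting policy setup alongside the other API calls, the same request in TypeScript might look roughly like this sketch; the Rule type below is just my shorthand for the documented allow/deny entries, not something from the official client library.

      // Rough TypeScript equivalent of the curl call above. The Rule shape mirrors
      // the documented JSON body; the official client library may differ.
      interface Rule { action: "allow" | "deny"; domain: string }

      async function setNetworkPolicy(sprite: string, rules: Rule[]): Promise<void> {
        const res = await fetch(
          `https://api.sprites.dev/v1/sprites/${sprite}/policy/network`,
          {
            method: "POST",
            headers: {
              Authorization: `Bearer ${process.env.SPRITES_TOKEN}`,
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ rules }),
          },
        );
        if (!res.ok) throw new Error(`policy update failed: ${res.status}`);
      }

      await setNetworkPolicy("my-sprite", [
        { action: "allow", domain: "github.com" },
        { action: "allow", domain: "*.npmjs.org" },
      ]);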

      Scale-to-zero billing

      Sprites have scale-to-zero baked into the architecture. They go to sleep after 30 seconds of inactivity, wake up quickly when needed and bill you for just the CPU hours, RAM hours and GB-hours of storage you use while the Sprite is awake.

      Fly estimate a 4 hour intensive coding session as costing around 46 cents, and a low traffic web app with 30 hours of wake time per month at ~$4.

      (I calculate that a web app that consumes all 8 CPUs and all 8GBs of RAM 24/7 for a month would cost ((7 cents * 8 * 24 * 30) + (4.375 cents * 8 * 24 * 30)) / 100 = $655.2 per month, so don't necessarily use these as your primary web hosting solution for an app that soaks up all available CPU and RAM!)
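
      For what it's worth, the unit rates implied by that parenthetical can be written out as a quick sanity check; the 7¢ per CPU-hour and 4.375¢ per GB-hour figures below are back-solved from the formula, not taken from Fly's pricing page.

      // Back-of-the-envelope check of the worst case above. The unit rates are
      // inferred from the parenthetical formula, not from Fly's pricing page.
      const cpuCentsPerHour = 7;       // per CPU
      const ramCentsPerGbHour = 4.375; // per GB of RAM
      const hoursPerMonth = 24 * 30;   // fully awake, all month

      const cents =
        cpuCentsPerHour * 8 * hoursPerMonth + ramCentsPerGbHour * 8 * hoursPerMonth;
      console.log(`$${(cents / 100).toFixed(2)}/month`); // "$655.20/month"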

      Two of my favorite problems at once

      I was hopeful that Fly would enter the developer-friendly sandbox API market, especially given other entrants from companies like Cloudflare and Modal and E2B.

      I did not expect that they'd tackle the developer sandbox problem at the same time, and with the same product!

      My one concern here is that it makes the product itself a little harder to explain.

      I'm already spinning up some prototypes of sandbox-adjacent things I've always wanted to build, and early signs are very promising. I'll write more about these as they turn into useful projects.

      Update: Here's some additional colour from Thomas Ptacek on Hacker News:

      This has been in the works for quite awhile here. We put a long bet on "slow create fast start/stop" --- which is a really interesting and useful shape for execution environments --- but it didn't make sense to sandboxers, so "fast create" has been the White Whale at Fly.io for over a year.


    3. 🔗 r/reverseengineering Hacking Denuvo rss
    4. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +1 release rss
      sync repo: +1 plugin, +1 release
      
      ## New plugins
      - [Suture](https://github.com/libtero/suture) (1.0.0)
      
    5. 🔗 HexRaysSA/plugin-repository commits Merge pull request #16 from 19h/v1 rss
      Merge pull request #16 from 19h/v1
      
      chore: Register libtero and 19h IDA tools in known repositories
      
    6. 🔗 r/LocalLLaMA I clustered 3 DGX Sparks that NVIDIA said couldn't be clustered yet...took 1500 lines of C to make it work rss

      NVIDIA officially supports clustering two DGX Sparks together. I wanted three. The problem: each Spark has two 100Gbps ConnectX-7 ports. In a 3-node triangle mesh, each link ends up on a different subnet. NCCL's built-in networking assumes all peers are reachable from a single NIC. It just... doesn't work.

      So I wrote a custom NCCL network plugin from scratch. What it does:

      • Subnet-aware NIC selection (picks the right NIC for each peer; see the sketch after this list)
      • Raw RDMA verbs implementation (QP state machines, memory registration, completion queues)
      • Custom TCP handshake protocol to avoid deadlocks
      • ~1500 lines of C
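
      The subnet-aware selection in that first bullet is conceptually simple: match each peer's address against the subnets of the local NICs and use the interface that shares a link with that peer. Here's a rough TypeScript sketch of the idea; the real plugin does this in C against RDMA verbs devices, and the interface names, addresses, and /30 subnets below are made up for illustration.

      // Conceptual sketch of subnet-aware NIC selection: pick the local interface
      // whose subnet contains the peer's address. The real plugin implements this
      // in C on top of RDMA verbs; names and addresses here are made up.
      interface Nic { name: string; address: string; prefixLen: number }

      function toInt(ip: string): number {
        return ip.split(".").reduce((acc, octet) => (acc << 8) | Number(octet), 0) >>> 0;
      }

      function sameSubnet(a: string, b: string, prefixLen: number): boolean {
        const mask = prefixLen === 0 ? 0 : (~0 << (32 - prefixLen)) >>> 0;
        return (toInt(a) & mask) === (toInt(b) & mask);
      }

      function pickNic(nics: Nic[], peer: string): Nic | undefined {
        return nics.find((nic) => sameSubnet(nic.address, peer, nic.prefixLen));
      }

      // Two ConnectX-7 ports per node, each point-to-point link on its own subnet.
      const nics: Nic[] = [
        { name: "mlx5_0", address: "10.0.1.1", prefixLen: 30 },
        { name: "mlx5_1", address: "10.0.2.1", prefixLen: 30 },
      ];
      console.log(pickNic(nics, "10.0.1.2")?.name); // mlx5_0
      console.log(pickNic(nics, "10.0.2.2")?.name); // mlx5_1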

      The result: Distributed inference across all 3 nodes at 8+ GB/s over RDMA.

      The NVIDIA support tier I'm currently on:

      ├── Supported configs ✓
      ├── "Should work" configs
      ├── "You're on your own" configs
      ├── "Please don't call us" configs
      ├── "How did you even..." configs
      └── You are here → "Writing custom NCCL plugins to cluster standalone workstations over a hand-wired RDMA mesh"
      

      GitHub link: https://github.com/autoscriptlabs/nccl-mesh-plugin

      Happy to answer questions about the implementation. This was a mass of low-level debugging (segfaults, RDMA state machine issues, GID table problems) but it works.

      submitted by /u/Ok-Pomegranate1314
      [link] [comments]

    7. 🔗 badlogic/pi-mono v0.42.1 release

      Fixed

      • Symlinked directories in prompts/ folders are now followed when loading prompt templates (#601 by @aliou)
    8. 🔗 r/LocalLLaMA RTX Blackwell Pro 6000 wholesale pricing has dropped by $150-200 rss

      Obviously the RTX Blackwell Pro 6000 cards are of great interest to the people here. I see them come up a lot. And we all ooh and ahh over the people that have 8 of them lined up in a nice row.

      It also seems to me like the market is suffering from lack of transparency on these.

      My employer buys these cards wholesale, and I can see current pricing and stock in our distributors' systems. (And I may have slipped in an order for one for myself...) It's eye-opening.

      I'm probably not supposed to disclose the exact price we buy these at. But I wanted people to know that unlike everything else with RAM in it, the wholesale price of these has dropped by about ~$150-200 from December to January.

      I will also say that the wholesale price for the 6000 Pro is only about $600 higher than the wholesale price for the new 72GiB 5000 Pro. So, for the love of god, please don't buy that!

      (And no, this is not marketing or an ad; I cannot sell anyone these cards at any price. I would be fired immediately. I just want people to have the best available information when they're looking to buy something this expensive.)

      submitted by /u/TastesLikeOwlbear
      [link] [comments]

    9. 🔗 HexRaysSA/plugin-repository commits chore: Register libtero and 19h IDA tools in known repositories rss
      chore: Register libtero and 19h IDA tools in known repositories
      
      known-repositories.txt (modified):
      - Added three repositories from user libtero: suture, graphviewer, and idaguides
      - Added five repositories from user 19h: ida-lifter, ida-codedump, ida-semray, idalib-dump, chernobog
      
      Impact:
      - Expands the tracking list to include additional IDA Pro related utilities, specifically focusing on lifting, dumping, deobfuscation, and graph visualization tools.
      
    10. 🔗 19h/ida-structor v0.1.0 release

      Full Changelog : v0.0.3...v0.1.0

    11. 🔗 r/LocalLLaMA The reason why RAM has become so expensive rss
    12. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +1 release rss
      sync repo: +1 plugin, +1 release
      
      ## New plugins
      - [BinSync](https://github.com/binsync/binsync) (5.10.1)
      
    13. 🔗 r/LocalLLaMA DeepSeek V4 Coming rss

      According to two people with direct knowledge, DeepSeek is expected to roll out a next‑generation flagship AI model in the coming weeks that focuses on strong code‑generation capabilities.

      The two sources said the model, codenamed V4, is an iteration of the V3 model DeepSeek released in December 2024. Preliminary internal benchmark tests conducted by DeepSeek employees indicate the model outperforms existing mainstream models in code generation, including Anthropic’s Claude and the OpenAI GPT family.

      The sources said the V4 model achieves a technical breakthrough in handling and parsing very long code prompts, a significant practical advantage for engineers working on complex software projects. They also said the model’s ability to understand data patterns across the full training pipeline has been improved and that no degradation in performance has been observed.

      One of the insiders said users may find that V4’s outputs are more logically rigorous and clear, a trait that indicates the model has stronger reasoning ability and will be much more reliable when performing complex tasks.

      https://www.theinformation.com/articles/deepseek-release-next-flagship-ai-model-strong-coding-ability

      submitted by /u/External_Mood4719
      [link] [comments]

    14. 🔗 @malcat@infosec.exchange [#kesakode](https://infosec.exchange/tags/kesakode) DB update to 1.0.48: mastodon

      #kesakode DB update to 1.0.48:

      ● new sigs: Crazyhunter, Echogather, IranBot, MaskGramStealer, PulsarRat and Themeforestrat
      ● 9 existing entries updated
      ● FP-fixed signatures: 82
      ● 1146 new clean programs whitelisted
      ● +527K unique functions
      ● +700K unique strings

    15. 🔗 r/LocalLLaMA (The Information): DeepSeek To Release Next Flagship AI Model With Strong Coding Ability rss
    16. 🔗 pranshuparmar/witr v0.2.2 release

      What's Changed

      New Contributors

      Full Changelog : v0.2.1...v0.2.2

    17. 🔗 r/LocalLLaMA Big tech companies, now "DRAM beggars," are staying in Pangyo and Pyeongtaek, demanding "give us some supplies." rss

      Not a Korean speaker. Came across this in another sub. The TLDR is that everyone is scrambling to buy as much as they can, as soon as they can, because suppliers are "demanding a 50-60% increase in server DRAM supply prices from the previous quarter during their first-quarter negotiations with customers". Per the article, DDR4 prices went up from $1.40 last January to $9.30 in December (my interpretation is $/GB). If they're increasing by another 50%, that's almost $14/GB!!! So, 1TB of DDR4-3200 will cost north of $14k by Q2 if this is true 🤯 In case anyone thought things weren't already bad, it's going to get much much worse this year.

      Here's the full Google translation of the article:

      DRAM, a type of memory semiconductor, was the key driver behind Samsung Electronics' first-quarter operating profit surpassing 20 trillion won. DRAM products, including high-bandwidth memory (HBM), are a core component of the computing infrastructure supporting the artificial intelligence (AI) era. The semiconductor industry predicts that the DRAM shortage, which began in earnest in the second half of last year, will continue until the end of this year, with prices also expected to continue rising.

      Samsung Electronics and SK Hynix, major suppliers of DRAM, are reportedly demanding a 50-60% increase in server DRAM supply prices from the previous quarter during their first-quarter negotiations with customers. A semiconductor industry insider reported, "Even with significantly higher prices, the prevailing sentiment is 'let's buy as much as we can before it gets more expensive.'" Recently, semiconductor purchasing managers from Silicon Valley tech companies, nicknamed "DRAM Beggars," have reportedly been competing fiercely to secure remaining DRAM inventory at hotels in the Pangyo and Pyeongtaek areas. The semiconductor industry analyzes that "the demand that was initially focused on HBM in the early days of the AI craze is now spreading to server DRAM, creating an unprecedented semiconductor boom."

      DRAM is a semiconductor that manages a computer's "short-term memory." It stores and quickly transmits necessary data when the central processing unit (CPU), the brain, performs tasks. HBM is specialized for seamlessly delivering the massive data required for AI by increasing the data transmission path (bandwidth) dozens of times compared to conventional DRAM. However, HBM is extremely expensive and has limitations in increasing capacity. This explains why big tech companies are scrambling to secure server DRAM products to store more data.

      The average contract price of DRAM soared from $1.40 (based on 8GB DDR4) in January last year to $9.30 in December. This marks the first time in seven years and four months that DRAM prices have surpassed the $9 threshold. Kim Dong-won, head of the research center at KB Securities, said, "Due to this price increase, the operating profit margin (the ratio of operating profit to sales) of some general-purpose memories (widely used standard memories) is expected to reach 70%, and DDR5 may even surpass the margin of HBM3E. This year, semiconductor companies' performance is expected to be determined by general-purpose memories."

      submitted by /u/FullstackSensei
      [link] [comments]

    18. 🔗 batrachianai/toad The Blinking Release release

      [0.5.25] - 2026-01-09

      Added

      • Added F1 key to toggle help panel
      • Added context help to main widgets

      Changed

      • Changed sidebar binding to ctrl+b

      [0.5.24] - 2026-01-08

      Added

      • Added sound for permission request
      • Added terminal title
      • Added blinking of terminal title when asking permission
      • Added an error message if the agent reports an internal error during its turn
    19. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +2 releases, ~1 changed rss
      sync repo: +1 plugin, +2 releases, ~1 changed
      
      ## New plugins
      - [ida-security-scanner](https://github.com/SymbioticSec/ida-security-scanner) (0.1.2, 0.0.1)
      
      ## Changes
      - [unicorn-tracer-arm64](https://github.com/chenxvb/Unicorn-Trace):
        - 0.1: archive contents changed, download URL changed
      
    20. 🔗 HexRaysSA/plugin-repository commits Merge pull request #15 from Anthony-Bondu/patch-1 rss
      Merge pull request #15 from Anthony-Bondu/patch-1
      
      Add SymbioticSec/ida-security-scanner to known repositories
      
    21. 🔗 HexRaysSA/plugin-repository commits Add SymbioticSec/ida-security-scanner to known repositories rss
      Add SymbioticSec/ida-security-scanner to known repositories
      
      Manually added a new repository entry for 'SymbioticSec/ida-security-scanner'.
      
    22. 🔗 badlogic/pi-mono v0.42.0 release

      Added

      • Added OpenCode Zen provider support. Set OPENCODE_API_KEY env var and use opencode/<model-id> (e.g., opencode/claude-opus-4-5).
    23. 🔗 badlogic/pi-mono v0.41.0 release

      Added

      • Anthropic OAuth support is back! Use /login to authenticate with your Claude Pro/Max subscription.
    24. 🔗 @cxiao@infosec.exchange RE: mastodon

      RE: https://infosec.exchange/@watchTowr/115860948823554212

      spoiler alert it's ../ AGAIN 😭😭😭😭

    25. 🔗 badlogic/pi-mono v0.40.1 release

      Removed

      • Anthropic OAuth support (/login). Use API keys instead.
    26. 🔗 r/LocalLLaMA OK I get it, now I love llama.cpp rss

      I just made the switch from Ollama to llama.cpp. Ollama is fantastic for the beginner because it lets you super easily run LLMs and switch between them all. Once you realize what you truly want to run, llama.cpp is really the way to go.

      My hardware ain't great: I have a single 3060 12GB GPU and three P102-100 GPUs for a total of 42GB. My system RAM is 96GB along with an Intel i7-9800x. It blows my mind what a difference some tuning can make. You really need to understand each of llama.cpp's options to get the most out of it, especially with uneven VRAM like mine. I tried ChatGPT, Perplexity, and Google AI Studio, and surprisingly only Google AI Studio could optimize my settings while teaching me along the way.

      Crazy how these two commands both fill up the RAM but one is twice as fast as the other. ChatGPT helped me with the first one, Google AI with the other ;). Now I'm happy running local lol.

      11t/s:
      sudo pkill -f llama-server; sudo nvidia-smi --gpu-reset -i 0,1,2,3 || true; sleep 5; sudo CUDA_VISIBLE_DEVICES=0,1,2,3 ./llama-server --model /home/llm/llama.cpp/models/gpt-oss-120b/Q4_K_M/gpt-oss-120b-Q4_K_M-00001-of-00002.gguf --n-gpu-layers 21 --main-gpu 0 --flash-attn off --cache-type-k q8_0 --cache-type-v f16 --ctx-size 30000 --port 8080 --host 0.0.0.0 --mmap --numa distribute --batch-size 384 --ubatch-size 256 --jinja --threads $(nproc) --parallel 2 --tensor-split 12,10,10,10 --mlock

      21t/s
      sudo pkill -f llama-server; sudo nvidia-smi --gpu-reset -i 0,1,2,3 || true; sleep 5; sudo GGML_CUDA_ENABLE_UNIFIED_MEMORY=0 CUDA_VISIBLE_DEVICES=0,1,2,3 ./llama-server --model /home/llm/llama.cpp/models/gpt-oss-120b/Q4_K_M/gpt-oss-120b-Q4_K_M-00001-of-00002.gguf --n-gpu-layers 99 --main-gpu 0 --split-mode layer --tensor-split 5,5,6,20 -ot "blk\.(2[1-9]|[3-9][0-9])\.ffn_.*_exps\.weight=CPU" --ctx-size 30000 --port 8080 --host 0.0.0.0 --batch-size 512 --ubatch-size 256 --threads 8 --parallel 1 --mlock

      Nothing here is worth copying and pasting as it is unique to my config but the moral of the story is, if you tune llama.cpp this thing will FLY!

      submitted by /u/vulcan4d
      [link] [comments]

    27. 🔗 Ampcode News Agents Panel rss

      Agents Panel

      The Amp editor extension now has a new panel to view and manage all active agent threads.

      You can use the keyboard to navigate between threads:

      • j/k or arrow keys to move between threads
      • Space to expand a thread panel to show the last message or tool result
      • Enter to open a thread
      • e to archive or unarchive a thread
      • Esc to toggle focus between the thread list and the input, which starts new threads

      We recommend archiving old threads so the displayed threads represent your working set. You can use Archive Old Threads from the Amp command palette (Cmd-K from the Amp panel) to archive threads older than 72 hours.

      As coding agents improve and require less direct human oversight, more time will be spent by humans in managing and orchestrating work across multiple agent threads. We'll have more to share soon.

      To get started, click the button on the left end of the navbar or use Cmd-Opt-I (macOS) or Ctrl-Alt-I (Windows/Linux).