

to read (pdf)

  1. As Rocks May Think | Eric Jang
  2. Doing the thing is doing the thing
  3. Reframing Agents
  4. How to Choose Colors for Your CLI Applications · Luna’s Blog
  5. A Protocol for Package Management | Andrew Nesbitt

  1. February 08, 2026
    1. 🔗 r/Leeds Attending Leeds Crown Court - Advice please rss

      I'm interested in visiting the Leeds Crown Court as a citizen and wondering if there is anyone on this sub who could give detailed advice on how it works, what my rights are to view public hearings and what to expect. Is it interesting, is it worth visiting, etc. Thank you so much for your time.

      submitted by /u/BenjiD123

    2. 🔗 r/reverseengineering joshuanwalker/Raiders2600: Reverse Engineering Raiders of the Lost Ark for the Atari 2600 rss
    3. 🔗 r/york Best parmo in York? rss

      I’ve lived here 7 years and have not yet had a good parmo. I miss them. Most of the time I’ve had one they’ve either been cold, cut into weird chunks or had underdone cheese/goopy bechamel/soggy chicken.

      Help a former Teessider out!

      submitted by /u/Autoembourgeoisement

    4. 🔗 gulbanana/gg GG 0.38.1 release

      Fixed

      • XDG_CONFIG_HOME wasn't being used to look up global gitignores.
    5. 🔗 r/Leeds Food help if anyone can rss

      Been a bit down on my luck recently, I've come into some money problems and I've been left without. I don't want to post this but I don't have a choice. Does anyone know of anywhere or anyone who can help me with some food? I'm not looking for a lot, I've just never had to do this and I don't know where to turn.

      submitted by /u/Own-Situation7386

    6. 🔗 r/Yorkshire Had to stop and brew up this morning rss
    7. 🔗 r/LocalLLaMA PR opened for Qwen3.5!! rss

      https://github.com/huggingface/transformers/pull/43830/

      Looking at the code at src/transformers/models/qwen3_5/modeling_qwen3_5.py, it looks like the Qwen3.5 series will have VLMs right off the bat!

      submitted by /u/Mysterious_Finish543

    8. 🔗 Register Spill Joy & Curiosity #73 rss

      A year ago, on this very newsletter, I wondered: how might AI change programming?

      Here are some of the questions I asked in that post:

      " Will we write docstrings at the top of files that aren't meant to be read by humans, but by LLMs when they ingest the file into their context window?"

      " Will we see a melting of language servers and LLMs?"

      " What will change once we start to optimize code and processes around code purely for the reader, because the writer's a machine?"

      " Will we change how we modularize code and switch to writing many smaller programs because they're easier for LLMs to digest than large codebases?"

      It's been a year and now most of these questions sound naive to me. Of course we'll write documentation for agents, language servers seem dead, and absolutely one hundred percent are we optimizing code for readability over writability, except that now the reader is also an agent. And small programs? Yes, we're all optimizing codebases for the agents now.

      Here's a little anecdote for you, to show what happened in a year.

      On Tuesday, I was on a call with Tim and Camden to discuss something about our new architecture, and they suggested that we use UUIDs everywhere. Hmm, I don't know, UUIDs aren't a silver bullet you know, they do come with downsides, I said. But we don't have those downsides, they said, because our tables are literally a few hundred rows in this setup. Right, right, I said, but UUIDs are kinda ugly and when you look at them they don't give you any insights.

      On Thursday, Tim then said: hey, didn't you just say on Raising An Agent that you need to optimize for agents, not for humans, even at the cost of human developer experience? And I don't remember what exactly I said in response, but it boiled down to: you'll see, and then I will say that I told you so, UUIDs are ugly.

      Then yesterday, on Saturday, I realized Tim's right. Who am I kidding. Agents will read far more UUIDs than I ever will in the future. I had an aesthetic objection to something I'll barely see. The agents, though, they will deal with the UUIDs and they love them.

      • We recorded another episode of Raising An Agent. Quinn and I talk about where the frontier of these coding agents is moving to, why we are going to kill the Amp editor extension and why we think neither the sidebar nor the text editor is the future, and, finally, we talk about how wild it is to build in AI land and how every playbook software companies had in the last twenty, thirty years is now outdated. The only winning move now is to accept that the board will be flipped at random intervals. It's 55 minutes long and a condensed version of what I'd tell you this evening if you and I went out for beers.

      • Recorded another short video: "Is this the bet you want to take? While everything around us is changing?"

      • My colleague Lewis wrote a wonderful post about giving agents feedback: Feedback Loopable. There are so many good ideas in there: the arrow, the URL updating, the logs, the debug/REPL/CLI thing. Highly recommend it.

      • Hey, seriously, watch this talk: Rich Hickey - Simple Made Easy. I've linked to it before, I've tweeted about it many times, but this week I had to find out (and then digest and recover) that some of my colleagues hadn't seen it. So now I'm here and I'm telling you that this might very well be the greatest talk about programming ever given. I'm not kidding. I'm not exaggerating. I mean it. Not a week goes by in which I don't think of it. I'm rearchitecting a system now and when I close my eyes I can see Rich standing there, one hand on the podium, the other in the air, hanging down, and him saying "…and you end up with this knot." Go and watch the talk. Don't complect.

      • Martin Alderson: "Two kinds of AI users are emerging. The gap between them is astonishing." There's a lot of great stuff in there. The first point about people being stuck in Copilot is very interesting, isn't it? If your product is a text box, then it looks like all the other text boxes. But some text boxes have actual genies behind them and others don't. You, as a user, can't tell in advance. The other point he makes, about enterprises shooting themselves in the foot with their security restrictions, is very interesting too.

      • Monday was my birthday and I got a fantastic gift: the Xteink X4! Yes, it's a tiny, tiny e-reader. My mini-review, even though I've barely read on it this week: very light, very small, very fun -- the software seems unfinished, it feels a bit hacky, it's a bit of a pain in the ass to transfer files to it, but there are a lot of articles and browser extensions on how to get the most out of it, there are also custom wallpapers and an open-source firmware you can flash on it, people are using their agents to write scripts for it, and I had Amp clone and extend the Send to X4 browser extension for me so that it fixes some broken epub formatting. Fun!

      • Talking about text boxes, here's Julian Lehr, Creative Director at Linear, with his case against conversational interfaces.

      • Mitchell: My AI Adoption Journey. "Through this journey, I've personally reached a point where I'm having success with modern AI tooling and I believe I'm approaching it with the proper measured view that is grounded in reality. I really don't care one way or the other if AI is here to stay, I'm a software craftsman that just wants to build stuff for the love of the game. The whole landscape is moving so rapidly that I'm sure I'll look back at this post very quickly and laugh at my naivete." Great post.

      • And here's DHH, roughly 6 weeks after I interviewed him and couldn't get a word in when he said that he doesn't believe in the hype and that agents can't write code he likes, telling his employees how to use agents.

      • Fantastic blog post: A Broken Heart. Read it, I swear you won't regret it. Great writing, great bug, great debugging. And -- you might not even notice, because of how calmly it's woven into the rest -- great use of agents.

      • Brendan Gregg is joining OpenAI. What a gig for him! There are very few places in the world right now where the relationship between performance and business value is as direct as it is there.

      • Also: Yehuda Katz is joining Vercel to work on v0. The next big framework programmer to go build developer tooling with AI. Because that's where the leverage is.

      • But then here's Jose Valim, another big framework guy, but one who turned into a language guy, explaining why he thinks Elixir is the best language for AI. I respect Valim immensely, he's one of this generation's greatest programmers, but I couldn't help reading this and thinking: does it matter? doc strings? As if GPT-5.2 wasn't a thing. The point with the tooling stands though. Remember when some languages flipped how they print stack traces so that the most important line is printed last, so that the developer reading them in the terminal can immediately see it without scrolling up? What's the equivalent for agents going to be?

      • And here's someone arguing that the age of frameworks is over, but that software engineering ("the true one") is back: "Automation and boilerplating have never been so cheap to overcome. I've been basically never writing twice the same line of code. I'm instantly building small tools I need, purpose built, exactly shaped around the problem at hand. I don't need any fancy monorepo manager. A simple Makefile covers 100% of my needs for 99% of my use cases. When things will get very complicated, and if they get very complicated, I'll think about it. But only then. Not a second before. This is engineering. You solve the problem you have, not the problem someone on a conference stage told you that you'll eventually have." I agree that agents solve many of the same problems that frameworks are solving, but the overlap isn't 100%. Frameworks will continue to be around but look vastly different in a few years.

      • Related: Start all of your commands with a comma. This seems very smart and while I don't have that much in my ~/bin, I'm intrigued. But I'm also wondering: won't the agents think it's a typo? Won't they get it wrong at least once every time they try to run a command? You know, as if they were trying to plug a USB-A thing in.

      • So, John Collison and Dwarkesh Patel interviewed Elon Musk and two of them drank Guinness. Now, I'm aware that by linking to this episode I risk receiving angry letters telling me that I shall not promote Musk and by linking to a conversation with him I endorse this and that. I'm aware, but I do think it's possible to listen to someone talk and find them interesting and thought-provoking without agreeing with them. That's what happened when I listened to this episode. I kept thinking about how crazy this is: data centers in space to generate tokens. Maybe it will actually happen? Wow. I also kept thinking about how Musk views problems and engineering challenges, and how he always wants to remove the next bottleneck, and how everything is a manufacturing question to him. Everything, as if he's in a game of Factorio. Building one thing isn't enough, to solve the problem you need to build the factory that builds the things. I do think that listening to this episode and reading the commentary around it is interesting, because energy and GPUs are at the heart of the transformation we're going through. It's also interesting because xAI is joining SpaceX and SpaceX is about to IPO and you have to wonder how much of this podcast is part of the IPO pitch.

      • "I miss thinking hard."

      • This tweet by Rasmus is worth reading. And so too is the reply by Protty (that's the Zig contributor, ex-TigerBeetle, hardcore hacker Protty). My personal, very boring take that's actually so boring that it often makes me wonder whether I might just not be smart enough to see what others apparently see: I don't think today's software is buggier than the software I used in 1998 or in 2002 or in 2010. I also don't think the software back then was better. What I do think is that the Lindy effect exists in software too and that's why Vim is something we should put in a shrine but not that all software from 1992 is great.

      • cdixon in 2013: what the smartest people do on the weekend is what everyone else will do during the week in ten years.

      • 2013, again, this time Jason Cohen: The Code is your Enemy. Prescient, right? I mean: "The weakness is the same as your strength as they often are: Your love of creation. You love to write clean, tested, scalable, extensible, beautiful code. You love converting 'JTBDs' into 960-wide artwork. You love developing an entire app in the browser against a scalable back-end. And because you love it, you do it. You wake up in the morning thinking about what you can make, not how you can sell. You open Visual Studio before you consult your to-do list because there's something you just need to tweak. You launch xterm before your CRM (if you even have one, which you don't) because the server was running just a tad slower than you'd expect and you want to paw through log files."

      • "Clawdbot is a boutique, nerdy project right now, but consider it as an underlying trend going forward: when the major consumer LLMs become smart and intuitive enough to adapt to you on-demand for any given functionality - when you'll eventually be able to ask Claude or ChatGPT to do or create anything on your computer with no Terminal UI - what will become of 'apps' created by professional developers? I especially worry about standalone utility apps: if Clawdbot can create a virtual remote for my LG television (something I did) or give me a personalized report with voice every morning (another cron job I set up) that work exactly the way I want, why should I even bother going to the App Store to look for pre-built solutions made by someone else? What happens to Shortcuts when any 'automation' I may want to carefully create is actually just a text message to a digital assistant away?" That's by Federico Viticci. I think he has programming chops, but I don't think he's worked as a software engineer and, well, now he's also seeing it: a lot of software is going to die in the next few years. Don't make the mistake and think that there'll be announcements or funerals.

      • Here's stevey with a very stevey but calm-and-reflective-stevey post about Anthropic, and the idea of a Golden Age that companies go through, and about a hundred other things too: The Anthropic Hive Mind. This is stevey at his best. And, coming back to what Viticci wrote, the closing paragraphs are very good: "If you have a strictly online or SaaS software presence, with no atoms in your product whatsoever, just electrons, then you are, candidly, pretty screwed if you don't pivot. I don't think there are any recipes for pivoting yet; this is all new, and it's all happening very fast. But there is a yellow brick road: spending tokens. This golden shimmering trail will lead your company gradually in the right direction. Your organization is going to have to learn a bunch of new lessons, as new bottlenecks emerge when coding is no longer the bottleneck. You need to start learning those bespoke organizational lessons early. The only way to know for sure that you're learning those lessons is if people are out there trying and making mistakes. And you can tell how much practice they're getting from their token spend." Here's my recipe for how to walk the yellow brick road, from December 2025. I'd update it to say: use deep mode in Amp. GPT-5.2 and GPT-5.3 -- that's the frontier now.

      • Wirth's Revenge. I really enjoyed this one. I don't agree with quite a few things in there but that's what made it stick with me and maybe I'll change my opinions because of it. Good stuff.

      • An invitation by Nolan Lawson to mourn our craft. "Someday years from now we will look back on the era when we were the last generation to code by hand. We'll laugh and explain to our grandkids how silly it was that we typed out JavaScript syntax with our fingers. But secretly we'll miss it."

      • Domenic Denicola: "But they haven't solved the need to plan and prioritize and project-manage. And by making even low-priority work addictive and engaging, there's a real possibility that programmers will be burning through their backlog of bugs and refactors, instead of just executing on top priorities faster. Put another way, while AI agents might make it possible for a disciplined team to ship in half the time, a less-disciplined team might ship following the original schedule, with beautifully-extensible internal architecture, all P3 bugs fixed, and several side projects and supporting tools spun up as part of the effort."

      • Nicholas Carlini at Anthropic "tasked Opus 4.6 using agent teams to build a C Compiler, and then (mostly) walked away." That's a milestone we'll think back to even next year, I'd say. But, of course, people have moved the goalposts out of the stadium already and are saying that the code the compiler produced is slower than GCC's at -O0. See you in the parking lot! But there's another interesting bit here, at the end: "So, while this experiment excites me, it also leaves me feeling uneasy. Building this compiler has been some of the most fun I've had recently, but I did not expect this to be anywhere near possible so early in 2026. The rapid progress in both language models and the scaffolds we use to interact with them opens the door to writing an enormous amount of new code. I expect the positive applications to outweigh the negative, but we're entering a new world which will require new strategies to navigate safely." Why do statements like these always sound so hollow when they come from people working at Anthropic?

      • Steven Sinofsky, who's seen quite a few platform and paradigm shifts from up close: "Death of Software. Nah." He's saying that "there will be more software than ever before. This is not just because of AI coding or agents building products or whatever. It is because we are nowhere near meeting the demand for what software can do." And "new tools will be created with AI that do new things." And also: "Finally, it is absolutely true that some companies will not make it. It is even true that in some very long time, longer than a career or generation, every company will be completely different or their product line and organization will have dramatically changed. This will not broadly happen on any investing timeline."

      • Jo Kristian Bergum with some very good thoughts on the future: "few things are worth building." The value of 10k lines of code is approaching $0, he says, and a lot of things will disappear along with the value these lines once held. "What survives? Systems that compress hard-won insights agents would have to rediscover at enormous token cost. Systems that operate on a cheaper substrate than inference. Systems that solve hard universal problems agents can't route around easily. Systems built for how agents actually work, not how we wish they worked." The point about the "cheaper substrate" is something I flip back and forth on. Let's see how it plays out.

      • David Crawshaw after "eight more months of agents": "I am having more fun programming than I ever have, because so many more of the programs I wish I could find the time to write actually exist. I wish I could share this joy with the people who are fearful about the changes agents are bringing. The fear itself I understand, I have fear more broadly about what the end-game is for intelligence on tap in our society. But in the limited domain of writing computer programs these tools have brought so much exploration and joy to my work."

      • Yesterday evening, to my great delight, I found out that there's a documentary on Netflix about The New Yorker's 100th anniversary. Why did no one tell me about this? Next time, please do. That's why I write this newsletter. But anyway: delightful and very good. Also, if you've never listened to it, I very often think of David Remnick's voice in this 2016 episode of the Longform podcast.

      • Now that's a headline: Notepad++ Hijacked by State-Sponsored Hackers. And here's a very interesting, very screenshot-heavy deep dive into how the attack works. But I want to read the New Yorker version of this. Who targets Notepad++? There has to be an amazing story behind this.

      If you also think programming in five years will look completely different from what it is now, you should subscribe.

    9. 🔗 w00tzenheimer/d810-ng v0.3.0 release

      What's Changed

      Full Changelog : v0.2.0...v0.3.0

    10. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 release rss
      sync repo: +1 release
      
      ## New releases
      - [DeepExtract](https://github.com/marcosd4h/DeepExtractIDA): 0.0.9
      
  2. February 07, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-02-07 rss

      IDA Plugin Updates on 2026-02-07

      Activity:

      • DeepExtractIDA
        • dcd54b81: Update README with symbol download documentation, usage examples, and…
        • 6cda98e7: supress errors due to not accessible files during file enumeration
        • 68f9b37e: Refactor symbol download to run symchk per-file with concurrency thro…
        • 453beef6: Add automatic PDB symbol downloading via symchk before IDA analysis a…
      • dylib_dobby_hook
      • idawilli
        • e7df8f68: Merge pull request #84 from williballenthin/claude/install-ida-script…
        • 9ebe8efb: Rename ida-sandbox to ida-codemode-sandbox and remove monty submodule
        • 15caca9e: Extract ida-codemode-api package from ida-sandbox
        • 0efcf2bb: Add usage examples to all 28 sandbox function docstrings and public API
        • df97e770: Add pyproject.toml for ida-monty-sandbox wheel distribution
        • a37321a4: Add execute(), system_prompt(), and api_reference() for chat plugin i…
        • 67ead8f9: Add comprehensive README with script-writing guidance for ida-sandbox
        • 5e7016f7: add GHA workflow for ida-sandbox tests
        • 9c0a9945: remove all mocks, test against real IDA with shared test binary
        • 8bc952e1: add 22 new IDA function wrappers to ida-sandbox with full test coverage
        • 02faf3d8: add resource limits, type checking, and structured error handling to …
        • 310c013d: add ida-sandbox: Monty-based sandbox exposing IDA Pro analysis routines
      • PyAgent
        • fd634227: chore(deps): upgrade kubernetes to 35.0.0 and urllib3 to 2.6.3 to fix…
        • 755cf9b0: Merge test-branch into main resolving divergent add/add conflicts by …
        • b475c55a: Merge test-branch into main resolving conflicts with theirs
        • 4d22ea42: chore: save local working changes before merging PRs
        • e2949c3e: chore: rust fallbacks, remove LANDiscovery debug, fix logits_processo…
        • 95bcc730: chore(deps): add signxml and lxml to requirements
        • 73368f4d: chore: ignore src/external_candidates/ingested/ and common python art…
        • 2ee0efb8: chore(merge): accept deletions under src/external_candidates/ingested
        • 46cb789f: fix(network,stats): allow send-only LANDiscovery start; use math.fsum…
        • d3d48cd8: test: make collection cross-platform; skip auto-generated Windows-pat…
      • quokka
        • d0e2b432: Merge pull request #87 from quarkslab/dependabot/github_actions/actio…
    2. 🔗 r/reverseengineering Chinese cheap Wifi Cam 365Cam rss
    3. 🔗 navidrome/navidrome v0.60.2 release

      This release expands ListenBrainz integration with artist URLs and top/similar songs, adds OpenSubsonic readonly and validUntil properties for playlists, and includes several bug fixes for the UI, scanner, and plugin system.

      Added

      • Backend Features:

        • Add artist URL, top songs, and similar songs support to the ListenBrainz agent. (#4934 by @kgarner7)

      • API Features:

        • Add OpenSubsonic readonly and validUntil properties to playlists. (#4993 by @kgarner7)

      • Plugin Features:

        • Add CallRaw method to SubsonicAPI host function with support for binary responses. (#4982 by @deluan)

      Fixed

      • UI:

        • Fix Last.fm URL handling and Biographies rendering on artist page. (#4980 by @kgarner7)
        • Fix Nautiline theme font path. (#4983 by @borisrorsvort)

      • Scanner:

        • Preserve first line in parentheses in lyrics. (#4985 by @deluan)

      • Server:

        • Clean up Last.fm content by removing "Read more" links from descriptions and bios. (e11206f0e by @deluan)
        • Handle WASM runtime panics in gotaglib openFile function. (4e720ee93 by @deluan)

      Full Changelog : v0.60.0...v0.60.5

      Helping out

      This release is only possible thanks to the support of some awesome people!

      Want to be one of them?
      You can sponsor, pay me a Ko-fi, or contribute with code.


    4. 🔗 r/LocalLLaMA I trained a 1.8M params model from scratch on a total of ~40M tokens. rss

      Ok so I've been working & experimenting with my own simple architecture. I call it Strawberry. This is a very, very small experimental model. It has 1.8M params and was trained on a dataset with ~9M tokens (~7M for training and ~2M for validation). The model was trained with a batch size of 16 and a context length of 256, so each step covered 16*256 = 4096 tokens. It was trained for 10k steps, meaning it saw a total of ~40M tokens.

      The dataset was manually scraped and cleaned. It contains texts from Wikipedia on various topics, personalities, games, movies, companies and more, plus fandom wikis for games such as GTA, RDR, The Last of Us and Mafia. It also contains storylines, scripts and story dialogues from games such as RDR 2, GTA 5, Cyberpunk 2077 and Mafia: The Old Country, transcripts of some of my favorite YouTube videos, and code from some of my personal codebases and other repos such as the Hazel game engine repo on GitHub. I tried my best to keep the programming languages limited to just Python, C#, C++ and JavaScript. The dataset also contains texts from several research papers, academic articles and blogs (mainly revolving around AI and LLMs in general). All of this made ~30M chars in total.

      After training for 10k steps the final train loss was around 3.5 and the val loss was around 3.8. This is the exact config for the model:

      {"dataset": {"data_division": 0.8, "load_from_file": true, "path": "data/webtext.bin"}, "checkpoints": {"path": "bin/ck18", "interval": 1000, "create_checkpoints": true}, "model_hyperparams": {"vocab_size": 8192, "block_size": 256, "r_layer": 3, "n_layer": 2, "n_head": 6, "n_embd": 96, "n_qkv": 384, "n_ffn": 384}, "optimizer_hyperparams": {"eps": 1e-08, "beta1": 0.9, "beta2": 0.99, "weight_decay": 0.001, "use_muon": false, "momentum": 0.95}, "model_path": "bin/s1.strawberry", "encoder_path": "bin/cl8k.bin", "init_from": "scratch", "seed": "auto", "gradient_accumulation_steps": 1, "batch_size": 16, "max_iters": 10000, "eval_interval": 1000, "log_interval": 100, "eval_iters": 100, "decay_lr": true, "lr_decay_iters": 10000, "learning_rate": 0.002, "cooldown_frac": 0.2, "warmup_iters": 500, "min_lr": 0.0002}

      cl8k is a tokenizer from Andrej Karpathy's tokenizer video, trained on the same dataset described above and then used to tokenize those ~30M chars into just ~9M tokens.

      The idea behind Strawberry and retention was to explore whether the attention weights can be generated in real time rather than learned. That's why I implemented a "Retention" mechanism: it generates "weights" based on your input, which are then used in attention. The formulation is a little bit similar to the standard linear attention formula. Because the QKV weights are dynamically generated rather than learned, you can increase the number of attention layers (i.e. model depth) without increasing the parameter count at all. However, stacking more attention layers has a problem: if multiple attention layers are stacked on top of each other without any non-linearity such as an FFN, performance can decline and the loss can get worse over time. That's why I implemented a mini-FFN right after the attention calculation and right before the output projection of each attention layer.

      So the weights of the QKV, mini-FFN and output projections are all generated and updated dynamically by the retention mechanism. I have two attention mechanisms:

      1. Linear attention, in this case Apple's AFT, for global context.
      2. Standard MHA attention for local context. I'm also planning to experiment with a mixture-of-attention-experts approach where each attention expert gets a different local window. I haven't implemented it yet because this model was too small for it to make sense, but I'll implement it later. Mixture of Attention Experts is why the SDPA version of the attention class is called The Expert Abundance. Idk why, but I like that name so I'm sticking with it.

      Currently I'm trying to optimize & improve the architecture more. So yeah. That's the entire thing. I'd love to know your views and opinions.

      submitted by /u/SrijSriv211
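
      A minimal sketch of the dynamic-QKV idea described above, assuming a standard PyTorch setup. All class and variable names here are hypothetical, not taken from the author's code, and the pooled summary used to generate the weights glosses over causality:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class DynamicQKVAttention(nn.Module):
          """Toy illustration: the Q/K/V projection matrices are produced from the
          input at runtime by a small generator network, instead of being stored as
          learned nn.Linear weights. A sketch, not the real Strawberry code."""

          def __init__(self, n_embd: int, n_head: int):
              super().__init__()
              assert n_embd % n_head == 0
              self.n_embd, self.n_head = n_embd, n_head
              # The only learned parameters: a generator mapping a pooled summary of
              # the sequence to three n_embd x n_embd projection matrices, plus the
              # output projection.
              self.generator = nn.Linear(n_embd, 3 * n_embd * n_embd)
              self.out_proj = nn.Linear(n_embd, n_embd)

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              B, T, C = x.shape
              summary = x.mean(dim=1)                        # (B, C); note: not causal
              w = self.generator(summary)                    # (B, 3*C*C)
              wq, wk, wv = w.view(B, 3, C, C).unbind(dim=1)  # per-sequence projections
              q = torch.einsum("btc,bcd->btd", x, wq)
              k = torch.einsum("btc,bcd->btd", x, wk)
              v = torch.einsum("btc,bcd->btd", x, wv)

              def split(t: torch.Tensor) -> torch.Tensor:
                  # reshape into heads for standard scaled dot-product attention
                  return t.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)

              y = F.scaled_dot_product_attention(split(q), split(k), split(v), is_causal=True)
              return self.out_proj(y.transpose(1, 2).reshape(B, T, C))

      x = torch.randn(2, 16, 96)                  # batch 2, context 16, n_embd 96
      print(DynamicQKVAttention(96, 6)(x).shape)  # torch.Size([2, 16, 96])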

    5. 🔗 r/LocalLLaMA Prompt injection is killing our self-hosted LLM deployment rss

      We moved to self-hosted models specifically to avoid sending customer data to external APIs. Everything was working fine until last week when someone from QA tried injecting prompts during testing and our entire system prompt got dumped in the response.

      Now I'm realizing we have zero protection against this. Traditional web application firewalls don't understand LLM-specific attacks. The model just treats malicious prompts like normal user input and happily complies.

      Has anyone actually solved prompt injection for production LLM apps? Not talking about basic input sanitization because adversarial prompts can be crafted to look completely normal.

      submitted by /u/mike34113

    6. 🔗 pydantic/monty v0.0.4 2026-02-07 release

      What's Changed

      New Contributors

      Full Changelog : v0.0.3...v0.0.4

    7. 🔗 badlogic/pi-mono v0.52.8 release

      New Features

      • Emacs-style kill ring (ctrl+k/ctrl+y/alt+y) and undo (ctrl+z) in the editor input (#1373 by @Perlence)
      • OpenRouter auto model alias (openrouter:auto) for automatic model routing (#1361 by @yogasanas)
      • Extensions can programmatically paste content into the editor via pasteToEditor in the extension UI context. See docs/extensions.md (#1351 by @kaofelix)
      • pi <package> --help and invalid subcommands now show helpful output instead of failing silently (#1347 by @ferologics)

      Added

      • Added pasteToEditor to extension UI context for programmatic editor paste (#1351 by @kaofelix)
      • Added package subcommand help and friendly error messages for invalid commands (#1347 by @ferologics)
      • Added OpenRouter auto model alias for automatic model routing (#1361 by @yogasanas)
      • Added kill ring (ctrl+k/ctrl+y/alt+y) and undo (ctrl+z) support to the editor input (#1373 by @Perlence)

      Changed

      Fixed

      • Fixed temporary git package caches (-e <git-url>) to refresh on cache hits for unpinned sources, including detached/no-upstream checkouts
      • Fixed aborting retries when an extension customizes the editor (#1364 by @Perlence)
      • Fixed autocomplete not propagating to custom editors created by extensions (#1372 by @Perlence)
      • Fixed extension shutdown to use clean TUI shutdown path, preventing orphaned processes
    8. 🔗 Simon Willison How StrongDM's AI team build serious software without even looking at the code rss

      Last week I hinted at a demo I had seen from a team implementing what Dan Shapiro called the Dark Factory level of AI adoption, where no human even looks at the code the coding agents are producing. That team was part of StrongDM, and they've just shared the first public description of how they are working in Software Factories and the Agentic Moment:

      We built a Software Factory: non-interactive development where specs + scenarios drive agents that write code, run harnesses, and converge without human review. [...]

      In kōan or mantra form:

      • Why am I doing this? (implied: the model should be doing this instead)

      In rule form:

      • Code must not be written by humans
      • Code must not be reviewed by humans

      Finally, in practical form:

      • If you haven't spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement

      I think the most interesting of these, without a doubt, is "Code must not be reviewed by humans". How could that possibly be a sensible strategy when we all know how prone LLMs are to making inhuman mistakes?

      I've seen many developers recently acknowledge the November 2025 inflection point, where Claude Opus 4.5 and GPT 5.2 appeared to turn the corner on how reliably a coding agent could follow instructions and take on complex coding tasks. StrongDM's AI team was founded in July 2025 based on an earlier inflection point relating to Claude Sonnet 3.5:

      The catalyst was a transition observed in late 2024: with the second revision of Claude 3.5 (October 2024), long-horizon agentic coding workflows began to compound correctness rather than error.

      By December of 2024, the model's long-horizon coding performance was unmistakable via Cursor's YOLO mode.

      Their new team started with the rule "no hand-coded software" - radical for July 2025, but something I'm seeing significant numbers of experienced developers start to adopt as of January 2026.

      They quickly ran into the obvious problem: if you're not writing anything by hand, how do you ensure that the code actually works? Having the agents write tests only helps if they don't cheat and assert true.

      This feels like the most consequential question in software development right now: how can you prove that software you are producing works if both the implementation and the tests are being written for you by coding agents?

      StrongDM's answer was inspired by Scenario testing (Cem Kaner, 2003). As StrongDM describe it:

      We repurposed the word scenario to represent an end-to-end "user story", often stored outside the codebase (similar to a "holdout" set in model training), which could be intuitively understood and flexibly validated by an LLM.

      Because much of the software we grow itself has an agentic component, we transitioned from boolean definitions of success ("the test suite is green") to a probabilistic and empirical one. We use the term satisfaction to quantify this validation: of all the observed trajectories through all the scenarios, what fraction of them likely satisfy the user?

      That idea of treating scenarios as holdout sets - used to evaluate the software but not stored where the coding agents can see them - is fascinating. It imitates aggressive testing by an external QA team - an expensive but highly effective way of ensuring quality in traditional software.
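
      A rough sketch of that satisfaction metric as I read it, with hypothetical names; the judge callable stands in for an LLM that reads the held-out scenario plus the observed transcript:

      from dataclasses import dataclass
      from typing import Callable

      @dataclass
      class Trajectory:
          scenario: str    # the end-to-end user story, stored outside the codebase
          transcript: str  # what the system actually did while running it

      def satisfaction(trajectories: list[Trajectory],
                       judge: Callable[[Trajectory], bool]) -> float:
          """Fraction of observed trajectories that likely satisfy the user."""
          if not trajectories:
              return 0.0
          return sum(judge(t) for t in trajectories) / len(trajectories)

      # Toy usage: a keyword check standing in for the LLM judge.
      runs = [Trajectory("grant Jira access", "access granted, requester notified"),
              Trajectory("revoke Slack access", "error: rate limited by twin API")]
      print(satisfaction(runs, lambda t: "error" not in t.transcript))  # 0.5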

      Which leads us to StrongDM's concept of a Digital Twin Universe - the part of the demo I saw that made the strongest impression on me.

      The software they were building helped manage user permissions across a suite of connected services. This in itself was notable - security software is the last thing you would expect to be built using unreviewed LLM code!

      [The Digital Twin Universe is] behavioral clones of the third-party services our software depends on. We built twins of Okta, Jira, Slack, Google Docs, Google Drive, and Google Sheets, replicating their APIs, edge cases, and observable behaviors.

      With the DTU, we can validate at volumes and rates far exceeding production limits. We can test failure modes that would be dangerous or impossible against live services. We can run thousands of scenarios per hour without hitting rate limits, triggering abuse detection, or accumulating API costs.

      How do you clone the important parts of Okta, Jira, Slack and more? With coding agents!

      As I understood it the trick was effectively to dump the full public API documentation of one of those services into their agent harness and have it build an imitation of that API, as a self-contained Go binary. They could then have it build a simplified UI over the top to help complete the simulation.

      Update: DTU creator Jay Taylor posted some extra context about this on Hacker News sharing a key prompting strategy:

      I did have an initial key insight which led to a repeatable strategy to ensure a high level of fidelity between DTU vs. the official canonical SaaS services:

      Use the top popular publicly available reference SDK client libraries as compatibility targets, with the goal always being 100% compatibility.

      With their own, independent clones of those services - free from rate-limits or usage quotas - their army of simulated testers could go wild. Their scenario tests became scripts for agents to constantly execute against the new systems as they were being built.

      This screenshot of their Slack twin also helps illustrate how the testing process works, showing a stream of simulated Okta users who are about to need access to different simulated systems.

      Screenshot of a Slack-like interface titled "DTU Slack" showing a thread view (Thread — C4B9FBB97) with "Focus first" and "Leave" buttons. The left sidebar lists channels including # org-general (182), # general (0) (shared×2), # it-support (0), # channel-0002 (0) (shared×2), # channel-0003 (0) through # channel-0020 (0), # org-finance (1), and a DMs section with a "Start" button. A "Create" button appears at the top of the sidebar. The main thread shows approximately 9 automated introduction messages from users with Okta IDs (e.g. @okta-u-423438-00001, @okta-u-423438-00002, etc.), all timestamped 2025-11-12Z between 18:50:31 and 18:51:51. Each message follows the format "Hi team! I'm [Name], joining as Employee in general. Key skills: [fictional skill phrases]. Excited to contribute!" All users have red/orange "O" avatar icons.

      This ability to quickly spin up a useful clone of a subset of Slack helps demonstrate how disruptive this new generation of coding agent tools can be:

      Creating a high fidelity clone of a significant SaaS application was always possible, but never economically feasible. Generations of engineers may have wanted a full in-memory replica of their CRM to test against, but self-censored the proposal to build it.

      The techniques page is worth a look too. In addition to the Digital Twin Universe they introduce terms like Gene Transfusion for having agents extract patterns from existing systems and reuse them elsewhere, Semports for directly porting code from one language to another and Pyramid Summaries for providing multiple levels of summary such that an agent can enumerate the short ones quickly and zoom in on more detailed information as it is needed.

      StrongDM AI also released some software - in an appropriately unconventional manner.

      github.com/strongdm/attractor is Attractor, the non-interactive coding agent at the heart of their software factory. Except the repo itself contains no code at all - just three markdown files describing the spec for the software in meticulous detail, and a note in the README that you should feed those specs into your coding agent of choice!

      github.com/strongdm/cxdb is a more traditional release, with 16,000 lines of Rust, 9,500 of Go and 6,700 of TypeScript. This is their "AI Context Store" - a system for storing conversation histories and tool outputs in an immutable DAG.

      It's similar to my LLM tool's SQLite logging mechanism but a whole lot more sophisticated. I may have to gene transfuse some ideas out of this one!
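
      I haven't dug into cxdb's actual design, but the general shape of an immutable, content-addressed DAG of conversation turns might look something like this sketch (all names hypothetical):

      import hashlib
      import json

      def put(store: dict, payload: dict, parents: tuple[str, ...] = ()) -> str:
          """Append-only: a node's key is the hash of its content plus its parent
          keys, so history can be extended but never rewritten in place."""
          node = {"payload": payload, "parents": list(parents)}
          key = hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()
          store[key] = node
          return key

      store: dict = {}
      root = put(store, {"role": "user", "content": "hello"})
      leaf = put(store, {"role": "assistant", "content": "hi!"}, parents=(root,))
      print(store[leaf]["parents"] == [root])  # True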

      A glimpse of the future?

      I visited the StrongDM AI team back in October as part of a small group of invited guests.

      The three person team of Justin McCarthy, Jay Taylor and Navan Chauhan had formed just three months earlier, and they already had working demos of their coding agent harness, their Digital Twin Universe clones of half a dozen services and a swarm of simulated test agents running through scenarios. And this was prior to the Opus 4.5/GPT 5.2 releases that made agentic coding significantly more reliable a month after those demos.

      It felt like a glimpse of one potential future of software development, where software engineers move from building the code to building and then semi-monitoring the systems that build the code. The Dark Factory.

      Wait, $1,000/day per engineer?

      I glossed over this detail in my first published version of this post, but it deserves some serious attention.

      If these patterns really do add $20,000/month per engineer to your budget they're far less interesting to me. At that point this becomes more of a business model exercise: can you create a profitable enough line of products that you can afford the enormous overhead of developing software in this way?

      Building sustainable software businesses also looks very different when any competitor can potentially clone your newest features with a few hours of coding agent work.

      I hope these patterns can be put into play with a much lower spend. I've personally found the $200/month Claude Max plan gives me plenty of space to experiment with different agent patterns, but I'm also not running a swarm of QA testers 24/7!

      I think there's a lot to learn from StrongDM even for teams and individuals who aren't going to burn thousands of dollars on token costs. I'm particularly invested in the question of what it takes to have agents prove that their code works without needing to review every line of code they produce.

      You are only seeing the long-form articles from my blog. Subscribe to /atom/everything/ to get all of my posts, or take a look at my other subscription options.

    9. 🔗 r/Leeds Where do people usually find good live music in Leeds? rss

      Hi everyone, I am trying to get more into the local live music scene in Leeds and wanted to hear where people usually go to find decent live music. Whether it is smaller gigs, acoustic nights, or regular live music at pubs and local venues, I would love to hear what places or nights people genuinely enjoy. I am not looking to promote anything, just hoping to discover more of what is happening around the city and support local musicians.

      submitted by /u/LuckyTreat8962

    10. 🔗 r/reverseengineering Using Javascript & WebSockets to automate an MMO browser game ! rss
    11. 🔗 r/Yorkshire North Yorkshire Moors at Fen Bog rss
    12. 🔗 r/york Best Ploughman’s In & Around York? rss

      I love a good Ploughman’s lunch: a good selection of cheeses, pickles, chutney, crusty bread, apple, celery, etc.

      In your opinion, where can the best Ploughman’s lunch be found in York or within a half hour drive? And what makes it good?

      submitted by /u/Yorkie-Talkie

    13. 🔗 r/reverseengineering Hexed - A fast, local-first, scriptable hex editor for modern file analysis rss
    14. 🔗 r/LocalLLaMA Nemo 30B is insane. 1M+ token CTX on one 3090 rss

      Been playing around with llama.cpp and some 30-80B parameter models with CPU offloading. Currently have one 3090 and 32 GB of RAM. I'm very impressed by Nemo 30B: 1M+ token context cache, runs on one 3090, CPU offloading for experts. Does 35 t/s, which is faster than I can read at least. Usually it's slow as fuck at this large a context window. Feed it a whole book or research paper and it's done summarizing in a few mins. This really makes long context windows on local hardware possible. The only other contender I have tried is Seed OSS 36b and it was much slower by about 20 tokens.

      submitted by /u/Dismal-Effect-1914

    15. 🔗 w00tzenheimer/d810-ng v0.2.0 — CI, Cython Speedups & Test Reliability release

      Highlights

      • Cython speedups fully operational in CI: c_dataflow.so loads and runs the fast constant propagation path
      • All 9 CI jobs passing: 4 unit-test, 3 speedups, 2 system-test
      • System tests survive IDA decompiler segfaults via pytest-forked
      • Zero test failures across pure-python and speedups modes (445 passed in speedups)

      CI & Build

      • Add system tests in Docker with GHCR image (idapro-tests-9.2)
      • Link Cython extensions against libida.so on Linux (libraries=["ida"])
      • Add LD_LIBRARY_PATH=/app/ida to Docker compose for runtime symbol resolution
      • Use pytest --forked to isolate segfaults in system tests
      • Fix GHCR auth, speedups build, and Docker configuration

      Cython Speedups

      • Robust fallback chain: catch TypeError alongside ImportError/AttributeError
      • Guard _fast_rewrite_instruction and _fast_transfer_single with try/except
      • Fix _fast_dataflow.py to set cy_* symbols to None on ImportError
      • Fix SUB_TABLE re-export in hexrays_helpers.py
      • Clean stale .so/.cpp build artifacts from source tree
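
      The fallback chain described above is a common pattern; a sketch of how it might look, with illustrative module and function names rather than the plugin's real ones:

      def pure_python_constant_propagation(block):
          # Stand-in for the slow path; the real code operates on Hex-Rays microcode.
          return block

      try:
          # Hypothetical import path for the compiled speedup.
          from d810_speedups import fast_constant_propagation as cy_constant_propagation
      except (ImportError, AttributeError, TypeError):
          # TypeError is caught alongside the usual import errors: a stale .so built
          # against a different ABI can fail at registration time, not just at import.
          cy_constant_propagation = None

      def constant_propagation(block):
          """Prefer the Cython fast path, degrade gracefully to pure Python."""
          if cy_constant_propagation is not None:
              try:
                  return cy_constant_propagation(block)
              except TypeError:
                  pass  # broken speedups build: fall back below
          return pure_python_constant_propagation(block)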

      Test Improvements

      • Fix fake jump mock tests: replace wrong hardcoded opcodes (0x31/0x30) with real ida_hexrays.m_jz/m_jnz
      • Centralize segfault skip list in known_issues.py
      • Add Scanner.module_from_spec try/except for dlopen failures
      • Add onerror handler to pkgutil.walk_packages

      Known Issues (tracked for future work)

      • 5 functions segfault in IDA decompile_func (skipped via SEGFAULT_FUNCTIONS)
      • 2 dispatcher detection tests xfail (stack offset API mismatch)
      • 3 KNOWN_INCORRECT verification rules (skipped from Z3 verification)
    16. 🔗 r/york York, Photographed rss

      A few more pics from around York I’ve taken recently, thought some of you folks may be interested! Nikon Z5 + 40mm F/2

      submitted by /u/Sockless_Ninja

    17. 🔗 r/Yorkshire Fae town rss

      Some time ago I was driving through Yorkshire, took a traffic diversion, and found what can only be described as a fae town.

      I drove up what must have been a 40° slope. It went on for, like, half an hour, nearly standstill traffic. Two different cars pulled over, smoke leaking from their engines. Cattle grid at the top, for cows that like to hike. I'm not kidding about the slope. The entire time I wondered how this was legal, and if I was going to tip over backwards. The steepest road I have ever driven up.

      Then we reached the town.

      I don't even know if I can articulate what was wrong with the town. It felt like driving through Hot Fuzz. There was bunting, and locals, but you felt, viscerally, that you did not want to slow to a stop. Like if you walked down any side street you'd find that it was all a wooden facade. It felt "haha, weird" for the first five minutes and then there was a mounting terror as the traffic crawled through a moment pressed in honey. I swear there was no English event going on. This was before the Queen died. Bunting and staring locals, all with a smile on their face.

      I hit the end of the town. The road ends abruptly in a T junction. Ahead is nothing but green moors, and a sign that advertises a "cave". It must have been legitimate. But fuck me if coming out of fae town and seeing a brown board inviting you to the "caves" wasn't genuinely, soul-deep terrifying.

      I pulled out. I think I turned right. Ended up on a road on the side of the mountain, and to my left was a valley full of forest. Could just see the green tops of trees. And a fuck huge factory, each storey at least 30 foot tall, sticking out of the woods like a grey monolith.

      So... where was I? What is the reality behind that horribly uncanny place? It genuinely haunts me. Please help.

      Edit: Solved! It was Winnats Pass, Castleton, and then the Hope Valley cement works (https://maps.app.goo.gl/9HSTyh9M5kzTfwn1A?g_st=ac - I think you'll agree it looks like a grey monolith)

      submitted by /u/Iron-Heretic

    18. 🔗 Arne Bahlo I built a NAS rss

      Without going too deep into it, in light of our late-stage surveillance capitalism and political escalations, I do not want to depend on tech companies for my personal data, especially those from the United States.

      Given that, I’ve decided to take matters into my own hands and build a NAS.


      History

      I’ve been hosting my own music since 2024, and I moved my personal data to a hosted NextCloud on Hetzner shortly after. Cryptomator helped me end-to-end encrypt my files on the Hetzner servers, and while I would recommend this, the workflow is a bit convoluted—for example, on macOS it mounts a volume that is harder to access.

      The biggest problem for me was photos—I was using iCloud photos, and even though something like Immich gets really close to the functionality I’m interested in, it’s not automatically end-to-end encrypted.

      Hardware

      Let’s talk about constraints:

      • No pre-built, closed-source solution (Synology, QNAP, etc.): If I do this, I’m going to do it the right way.
      • Ideally SSD-only: Using HDDs is cheap (see next point), but they’re more prone to fail and wayyy slower.
      • Enough power: Because I wanted to run apps like Immich or NextCloud, I needed a fast-enough processor and enough memory, at least 8 GB.
      • Reasonably priced: My budget was around 700 €.

      Things I considered:

      • Old QNAP stations: Some of these come with SATA slots that also support SSD, but ultimately the processors of the ones I looked at were too weak.
      • Framework/Intel NUC/whatever with an external NVMe/M.2 case for my storage SSDs: There aren’t a lot of external NVMe RAID cases out there that look trustworthy, plus it felt wrong to have them connected over USB; it’s probably a lot slower as well.

      At this point, I felt like there was no good solution, until I found the Intel NUC NUC6i7KYK, a.k.a. Skull Canyon. It has:

      • Intel Core i7 (6MB cache, up to 3.5 GHz)
      • 16 GB memory (at least the one I got on eBay)
      • 2 internal M.2 SSD slots

      This sounded perfect, and I got it used for around 200 €, with an internal 500 GB SSD already built in, which I used to test the setup.

      My plan is to then get 2× 4 TB SSDs and mirror them for maximum security. This is more than enough storage for our family for the next few years. Also have you looked at SSD prices lately? 100 €/TB is a good deal.

      System

      From my research, these were the three top options:

      • TrueNAS Community Edition (Scale): Open-source NAS system based on ZFS, with support for many apps using Docker.
      • Unraid: Closed source; supports ZFS, BTRFS and XFS; needs a license ($249 for Lifetime)
      • Handrolled: Build everything myself using Arch/NixOS w/ Docker

      While Unraid looked promising, I did not want any closed-source software. I thought about hand-rolling, but since I wanted this setup to be as stable as possible and I have very little experience with ZFS, I eventually decided on TrueNAS Scale.

      One annoyance with TrueNAS [1] was that the installation could not be on a drive used for data. After some research, I decided to install on an SSD in a SATA case connected via USB. Booting from an external drive is not great and should be avoided if possible, but this is a common setup for people building NAS systems using mini-computers.

      One problem I encountered was that initramfs failed to find the boot pool on boot because the (external) SSD was not discovered when it tried. I solved this problem by running this on the TrueNAS shell:

      midclt call system.advanced.update '{"kernel_extra_options": "rootdelay=20"}'
      

      This pauses the boot process for 20s, giving the USB controller time to set up the SSD.

      Security

      Security is all about attack vectors. My setup is meant to protect against theft and automated attacks. There is room to tighten this up even more, but this is more than good enough for me.

      ZFS encryption

      TrueNAS supports ZFS-native encryption. You can choose to encrypt the root pool, which I did; the encryption key is stored on the system itself for auto-unlock. This is great, but it offers no protection against the whole system being stolen.

      Because of this, I have a top-level dataset under tank called encrypted, which is encrypted with a password. This means the system boots, but no data (including app data) is accessible until I log in and decrypt the dataset.

      For this reason, I’m also not using the built-in backup solution; I don’t want the encryption key and password to be accessible on boot. More about backups further down.

      I disabled the option to show a text console without a password prompt, to prevent someone with physical access to the running, decrypted system from extracting data.

      Headscale

      There’s no way I’m exposing this system to the public internet. This is far too dangerous, even if you know what you’re doing. I still want to be able to access all my data on the go, so I set up Headscale, an open-source, self-hosted version of Tailscale that is compatible with the Tailscale apps. With this, I can access my NAS from anywhere via a self-hosted VPN.

      SSL

      But what if a machine in the VPN or on my local Wi-Fi gets compromised? The attacker would be able to sniff all HTTP traffic, extract credentials, and access all my data. This is why we encrypt our HTTP traffic.

      For the TrueNAS system, I forward HTTP to HTTPS and use the self-signed certificate that comes pre-installed.

      But most TrueNAS apps expose ports without SSL, which means we need a reverse proxy. While TrueNAS supports the option to expose ports for inter-container communication, each app gets its own Docker network, which means a reverse proxy can’t access other apps by default [2]. To fix this, I use Dragonify (which I forked for security reasons). It connects all apps to a single network, so I can use Nginx Proxy Manager [3] to serve apps with SSL on a subdomain.

      I bought a domain solely for this setup and use deSEC with an account created specifically for this domain. This means that if a token gets compromised, only that single domain is affected. But that token lives on the encrypted dataset, so it’s unlikely.

      Backups

      Because I don’t want any encryption keys or access credentials on the system partition, I’m not using the native TrueNAS backup systems. After some research, I’ve landed on Backrest, a configuration interface for Restic.

      Here’s what Filippo Valsorda has to say about Restic encryption:

      “The design might not be perfect, but it’s good. Encryption is a first-class feature, the implementation looks sane and I guess the deduplication trade-off is worth it. So… I’m going to use restic for my personal backups.”

      For storage, I’m using Hetzner Object Storage, mostly because it’s environmentally friendly and hosted in the EU.

      Apps

      We already talked about Tailscale, Dragonify, Nginx Proxy Manager, and Backrest.

      Here are the apps I’m using

      Conclusion

      So that’s my setup. It works really well, and I’m happy. I’ll try to edit this post in six months to give an update.

      Am I missing something? Let me know: hey@arne.me

      Thanks to Eric & Jan for proof-reading this post and listening to my ramblings!

      1. Not sure if this restriction also applies to Unraid/other systems.

      2. Apparently this was possible with 24.04 (Dragonfish)

      3. I might switch to a boring Caddy configuration soon.

      4. E.g. syncing my Obsidian vault or Steamdeck screenshots

  3. February 06, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-02-06 rss

      IDA Plugin Updates on 2026-02-06

      New Releases:

      Activity:

      • augur
      • d810-ng
        • decee2c1: fix: link Cython extensions against libida on Linux
        • b1105a55: fix: robust Cython fallback in constant propagation
        • 50e3c0dc: fix: skip/xfail all known system test failures for green CI
        • bfc65359: fix: use pytest -forked for system tests to survive segfaults
        • f0339856: fix: add tigress_minmaxarray to segfault skip list
        • a082111c: fix: wrap Scanner.module_from_spec in try/except, centralize segfault…
        • fab04fc6: fix: resolve CI system-test failures (imports, segfault skip, scanner)
        • 843f6ffd: chore(tools): scaffold package & vendoring dirs
        • f6806d46: chore: update plugin entry, packaging, ignores
        • e03f6b14: feat(unicode): Improve Unicode cleanup in ununicode script
        • eeacebbb: fix: Use static version in pyproject.toml to fix editable install
        • 48d14e6f: test: Skip switch_case_ollvm_pattern (segfault in decompile_func)
        • d6987475: fix(ci): Use PYTHONPATH instead of pip install -e for system tests
        • 2ccff1b1: fix(ci): Replace speedups import check with file listing
        • af639dcb: fix(ci): Install g++ in Docker for Cython compilation
        • d4a11159: fix(ci): Install setuptools before -no-build-isolation install
        • b1a728f0: fix(ci): Build speedups inside Docker container
        • d5e52298: fix(ci): Remove bare speedups import check
      • ghidra
        • ee0accc4: Merge branch 'GP-0_ryanmkurtz_PR-8924_jstasiak_docs'
        • 7650ae1a: Merge remote-tracking branch 'origin/patch'
        • 9460c49a: Merge remote-tracking branch 'origin/GP-6393_HeapSequenceIndirectDedup'
        • 8baadaba: GP-0: Changing default behavior of OptionChooser.getArgs() to return an
      • haruspex
      • hrtng
        • 442761b4: indirect branch/call deobfuscation:
      • ida-hcli
        • 0da0216b: remove idat invocations for version and platform detection
      • idasql
        • c954da42: fix: prevent MSVC config subdirectory for idasql output
        • f6771168: fix: output idasql.exe next to ida.exe in IDASDK bin
        • ee236246: feat: enable copilot provider in libagents
      • idawilli
        • 0c6e8ef1: Merge pull request #83 from williballenthin/claude/fix-ida-api-compat…
        • 4e8d3e8b: fix float fields rendered as hex integers in struct dissector
        • a1bb3a07: fix IDA API compatibility: use 4-arg generate_disassembly for IDA <9.2
        • 7789bd55: Merge pull request #82 from williballenthin/claude/add-struct-dissect…
        • 39dc5959: use hcli tag format for IDA versions in CI workflows
        • 436163f6: support IDA 9.0+ in struct dissector tests and CI
        • dcb834e6: add tests and CI for global struct dissector
      • iOS-Study
      • msc-thesis-LLMs-to-rank-decompilers
        • f3db2500: some corrections and probably finished background and related
      • PatternGen
      • PyAgent
        • 6e4750db: Copilot/sub pr 29 another one (#33)
        • ae205c02: Merge origin/test-branch into main
        • a51ece78: Test branch (#28)
      • rhabdomancer
    2. 🔗 Simon Willison Running Pydantic's Monty Rust sandboxed Python subset in WebAssembly rss

      There's a jargon-filled headline for you! Everyone's building sandboxes for running untrusted code right now, and Pydantic's latest attempt, Monty, provides a custom Python-like language (a subset of Python) in Rust and makes it available as both a Rust library and a Python package. I got it working in WebAssembly, providing a sandbox-in-a-sandbox.

      Here's how they describe Monty:

      Monty avoids the cost, latency, complexity and general faff of using full container based sandbox for running LLM generated code.

      Instead, it lets you safely run Python code written by an LLM embedded in your agent, with startup times measured in single digit microseconds not hundreds of milliseconds.

      What Monty can do:

      • Run a reasonable subset of Python code - enough for your agent to express what it wants to do
      • Completely block access to the host environment: filesystem, env variables and network access are all implemented via external function calls the developer can control
      • Call functions on the host - only functions you give it access to [...]

      A quick way to try it out is via uv:

      uv run --with pydantic-monty python -m asyncio
      

      Then paste this into the Python interactive prompt - the -m asyncio enables top-level await:

      import pydantic_monty
      code = pydantic_monty.Monty('print("hello " + str(4 * 5))')
      await pydantic_monty.run_monty_async(code)

      Monty supports a very small subset of Python - it doesn't even support class declarations yet!

      But, given its target use-case, that's not actually a problem.

      The neat thing about providing tools like this for LLMs is that they're really good at iterating against error messages. A coding agent can run some Python code, get an error message telling it that classes aren't supported and then try again with a different approach.
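      As a rough illustration of that loop, here's a sketch that uses only the Monty(...) and run_monty_async(...) calls from the snippet above; the caught exception type and the ask_llm() helper are stand-ins, not part of the real pydantic_monty API.

      import pydantic_monty

      async def run_with_retries(source: str, max_attempts: int = 3):
          for _ in range(max_attempts):
              try:
                  return await pydantic_monty.run_monty_async(pydantic_monty.Monty(source))
              except Exception as err:  # Monty's real error types may be more specific
                  # Feed the error text back to the model and let it rewrite the code,
                  # e.g. replacing an unsupported class with plain dicts and functions.
                  source = ask_llm(f"Rewrite this for Monty; it failed with: {err}\n\n{source}")
          raise RuntimeError("sandboxed code never succeeded")

      def ask_llm(prompt: str) -> str:
          ...  # hypothetical helper: call whatever model drives your agent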

      I wanted to try this in a browser, so I fired up a code research task in Claude Code for web and kicked it off with the following:

      Clone https://github.com/pydantic/monty to /tmp and figure out how to compile it into a python WebAssembly wheel that can then be loaded in Pyodide. The wheel file itself should be checked into the repo along with build scripts and passing pytest playwright test scripts that load Pyodide from a CDN and the wheel from a “python -m http.server” localhost and demonstrate it working

      Then a little later:

      I want an additional WASM file that works independently of Pyodide, which is also usable in a web browser - build that too along with playwright tests that show it working. Also build two HTML files - one called demo.html and one called pyodide-demo.html - these should work similar to https://tools.simonwillison.net/micropython (download that code with curl to inspect it) - one should load the WASM build, the other should load Pyodide and have it use the WASM wheel. These will be served by GitHub Pages so they can load the WASM and wheel from a relative path since the .html files will be served from the same folder as the wheel and WASM file

      Here's the transcript, and the final research report it produced.

      I now have the Monty Rust code compiled to WebAssembly in two different shapes - as a .wasm bundle you can load and call from JavaScript, and as a monty-wasm-pyodide/pydantic_monty-0.0.3-cp313-cp313-emscripten_4_0_9_wasm32.whl wheel file which can be loaded into Pyodide and then called from Python in Pyodide in WebAssembly in a browser.

      Here are those two demos, hosted on GitHub Pages:

      [Screenshot of the "Monty via Pyodide" demo: the pydantic_monty wheel loaded in Pyodide, running an example that evaluates "x + y" with different inputs; the output shows "10 + 20 = 30" and "100 + 200 = 300", executed in 4.0ms.]

      As a connoisseur of sandboxes - the more options the better! - this new entry from Pydantic ticks a lot of my boxes. It's small, fast, widely available (thanks to Rust and WebAssembly) and provides strict limits on memory usage, CPU time and access to disk and network.

      It was also a great excuse to spin up another demo showing how easy it is these days to turn compiled code like C or Rust into WebAssembly that runs in both a browser and a Pyodide environment.


    3. 🔗 HexRaysSA/plugin-repository commits sync repo: ~1 changed rss
      sync repo: ~1 changed
      
      ## Changes
      - [IDASQL](https://github.com/allthingsida/idasql):
        - 0.0.1: archive contents changed
      
    4. 🔗 @binaryninja@infosec.exchange We're going live in just a few minutes! Join us to explore a couple new mastodon

      We're going live in just a few minutes! Join us to explore a couple new features! Including one from the API that we haven't yet looked at on stream... https://youtube.com/live/T4k2D-3z_SI

    5. 🔗 r/wiesbaden Super Bowl rss

      Does anyone know of any bars or restaurants in the area that will be showing the Super Bowl this Sunday night?

      submitted by /u/Grandpas_leftnut
      [link] [comments]

    6. 🔗 r/LocalLLaMA GLM 5 Is Being Tested On OpenRouter rss
    7. 🔗 r/Leeds Roundhay festival- scam? rss

      Back in September, they announced Roundhay Festival is going to take place in July. They then announced Lewis Capaldi would headline the Saturday, and in October they announced Pitbull and Kesha would be playing the Friday. But since then no other acts have been announced, and no supports for either day. The dates they put in the posts promoting the event are often wrong.

      I grew up in the area, so this event intrigues me, but these acts aren't selling it for me, and I've kept waiting for other acts to be announced, as have most people I know.

      Is this Festival legit? Or is it sounding like a scam? Or is it just very badly run and organised?

      submitted by /u/Charlottebopp
      [link] [comments]

    8. 🔗 r/reverseengineering Dumping RAM to Recover Password of a Hikvision Camera rss
    9. 🔗 badlogic/pi-mono v0.52.7 release

      New Features

      • Per-model overrides in models.json via modelOverrides, allowing customization of built-in provider models without replacing provider model lists. See docs/models.md#per-model-overrides.
      • models.json provider models now merge with built-in models by id, so custom models can be added or replace matching built-ins without full provider replacement. See docs/models.md#overriding-built-in-providers.
      • Bedrock proxy support for unauthenticated endpoints via AWS_BEDROCK_SKIP_AUTH and AWS_BEDROCK_FORCE_HTTP1. See docs/providers.md.

      Breaking Changes

      • Changed models.json provider models behavior from full replacement to merge-by-id with built-in models. Built-in models are now kept by default, and custom models upsert by id.

      Added

      • Added modelOverrides in models.json to customize individual built-in models per provider without full provider replacement (#1332 by @charles-cooper)
      • Added AWS_BEDROCK_SKIP_AUTH and AWS_BEDROCK_FORCE_HTTP1 environment variables for connecting to unauthenticated Bedrock proxies (#1320 by @virtuald)

      Fixed

      • Fixed extra spacing between thinking-only assistant content and subsequent tool execution blocks when assistant messages contain no text
      • Fixed queued steering/follow-up/custom messages remaining stuck after threshold auto-compaction by resuming the agent loop when Agent-level queues still contain pending messages (#1312 by @ferologics)
      • Fixed tool_result extension handlers to chain result patches across handlers instead of last-handler-wins behavior (#1280)
      • Fixed compromised auth lock files being handled gracefully instead of crashing auth storage initialization (#1322)
      • Fixed Bedrock adaptive thinking handling for Claude Opus 4.6 with interleaved thinking beta responses (#1323 by @markusylisiurunen)
      • Fixed OpenAI Responses API requests to use store: false by default to avoid server-side history logging (#1308)
      • Fixed interactive mode startup by initializing autocomplete after resources are loaded (#1328)
      • Fixed modelOverrides merge behavior for nested objects and documented usage details (#1062)
    10. 🔗 r/LocalLLaMA [Release] Experimental Model with Subquadratic Attention: 100 tok/s @ 1M context, 76 tok/s @ 10M context (30B model, single GPU) rss

      Hey everyone,

      Last week I shared preliminary results on a new subquadratic attention mechanism (https://www.reddit.com/r/LocalLLaMA/comments/1qol3s5/preliminary_new_subquadratic_attention_20k_toks). Following up with the full release: model + inference code are now available.

      TL;DR : 30B model achieving O(L^(3/2)) scaling instead of O(L^2). Enables 1M–10M context on a single GPU with decode speeds that stay practical even at extreme context lengths. Ships with an OpenAI-compatible server and CLI to try out.

      - 🤗 Model : https://huggingface.co/concavity-ai/superlinear-exp-v0.1

      - 💻 Code : https://github.com/concavity-ai/superlinear (pip install superlinear)

      - 📄 Paper : https://arxiv.org/abs/2601.18401

      Main Idea

      You can think of attention as a search algorithm to find relevant information for next-token prediction. Standard attention is basically O(L) brute-force search. We're doing O(L^0.5) jump-search with learned routing: score O(L^0.5) candidate spans, select top-k, then do token-level attention within the selected spans.

      This gives O(L^(3/2)) total complexity while preserving random context access — any token can be selected by content-dependent routing, unlike fixed sliding windows. When you 10x the context length, the search budget only grows by ~3.2x. That subquadratic scaling really matters for long context.
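      A quick back-of-envelope check of that claim (my own arithmetic, not taken from the paper): if the per-token search budget grows like the square root of the context length, then a 10x longer context costs sqrt(10) ≈ 3.16x more per token.

      L_short, L_long = 1_000_000, 10_000_000
      growth = (L_long / L_short) ** 0.5   # per-token search budget grows roughly with sqrt(L)
      print(f"{growth:.2f}x")              # ~3.16x, the "~3.2x" quoted above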

      Performance (Single B200 GPU)

      | Context Length | Prefill (tok/s) | Decode (tok/s) | Memory |
      |----------------|-----------------|----------------|---------|
      | 1M tokens | ~20,202 | ~109 | 66 GB |
      | 10M tokens | ~5,576 | ~76 | ~120 GB |

      Key point: 1M → 10M context (10x increase) only drops decode speed by ~30%, not the 10x slowdown with dense attention.

      Why This Matters

      When you have fast long-context inference, usage patterns change. The key is maintaining the cache instead of reprocessing everything:

      - Almost-infinite chat : KV cache in memory for instant responses, save/restore sessions to disk for persistence

      - Document Q&A : Load documents once, ask cross-document questions without reprocessing (our GitHub example: 8 Wikipedia articles with cross-document reasoning)

      - Long-form generation : 20k+ token reasoning on difficult math problems and coherent long article writing, all with maintained context

      Early results: perfect NIAH at 512K context (up from 256K last week), cross-document reasoning working, subquadratic scaling working in practice.

      Since no existing inference engine is going to support our custom kernels, we built the full stack ourselves: Triton kernels, OpenAI-compatible server, session snapshots, chunked prefill, CLI with BM25 RAG.
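      Since the server speaks the OpenAI API, the standard openai client should work against it; a minimal sketch is below. The base URL, port, and model id are assumptions on my part, so check the superlinear README for the actual values.

      from openai import OpenAI

      # Assumed local endpoint and model id; adjust to whatever the superlinear server actually exposes.
      client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

      with open("corpus.txt") as f:        # e.g. several documents concatenated together
          context = f.read()

      resp = client.chat.completions.create(
          model="concavity-ai/superlinear-exp-v0.1",
          messages=[
              {"role": "system", "content": "Answer using only the provided documents."},
              {"role": "user", "content": context + "\n\nQuestion: which documents mention X?"},
          ],
      )
      print(resp.choices[0].message.content)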

      Limitations & Next Steps

      Current limitations:

      - This is an architecture + systems feasibility release, not production-quality

      - Limited training data (initial SFT only)

      - Comprehensive evals beyond NIAH still needed

      - FP16 only (66GB for 1M context) — quantization coming soon

      Quantization (coming soon):

      - 4-bit/8-bit quantization to run 1M context on 24GB consumer GPUs

      - Target: RTX 4090 / RTX 5090 with full 1M context

      - 2M context on 48GB cards (e.g., RTX 6000 Ada)

      Hardware support:

      - Currently CUDA only (B200, RTX 6000 Blackwell tested)

      - AMD ROCm port coming (Triton kernels should make this straightforward)

      - Eventually Apple Silicon (harder but not impossible)

      Training & Quality improvements:

      - Scaling up SFT data with more long-context examples

      - Potentially doing continued pretraining on long documents

      - Expanding perfect NIAH range beyond 512K

      - Real-world long-context benchmarks (book QA, codebase analysis, multi-document reasoning)

      New end-user applications : We are planning to develop local-first end-user applications based on this. What would you actually use long context for? Would love to hear specific use cases to help us prioritize.

      ---

      Trying something new is extremely hard. Everyone likes existing transformer architectures — optimizations at every level, predictable scaling laws. But to make truly long-context models practical on local hardware, I think we need new ideas. It doesn't hurt to try, right?

      I'm trying not to spam this sub, so the GitHub repo is the best place to follow progress. Happy to answer questions here though! If you try it and hit issues, open a GitHub issue. And if you have thoughts on long-context use cases, I'd love to hear them.

      Thanks for all the encouragement on the last post!

      Links :

      - 🤗 Model : https://huggingface.co/concavity-ai/superlinear-exp-v0.1

      - 💻 Code : https://github.com/concavity-ai/superlinear

      - 📄 Paper : https://arxiv.org/abs/2601.18401

      submitted by /u/Sad-Size2723
      [link] [comments]

    11. 🔗 News Minimalist 🐢 40% of cancer cases are preventable rss

      In the last 3 days ChatGPT read 90739 top news stories. After removing previously covered events, there are 12 articles with a significance score over 5.5.

      [6.3] Global report finds 40% of cancers preventable —bbc.co.uk(+56)

      A landmark World Health Organization study reveals that seven million annual cancer cases, nearly 40% of the global total, are preventable through lifestyle changes, vaccinations, and reduced environmental pollutant exposure.

      The International Agency for Research on Cancer identified tobacco use, infections like HPV, and alcohol as primary drivers. Smoking causes 3.3 million annual cases, while infections cause 2.3 million and alcohol 0.7 million.

      The study, published in Nature Medicine, highlights significant regional and gender disparities. Men face higher preventable risks than women, while infections dominate cases in sub-Saharan Africa compared to tobacco-related cancers in Europe.

      [6.0] Last U.S.-Russia pact expires, removing caps on largest atomic arsenals for first time in half-century —ctvnews.ca(+163)

      The New START Treaty between the United States and Russia expired Thursday, removing all limits on the world’s two largest nuclear arsenals for the first time in half a century.

      While Russia offered a one-year extension, the United States remained noncommittal, seeking China’s inclusion in a new agreement. Beijing has rejected joining such talks, and both major powers now consider themselves legally free to expand their deployed nuclear forces without treaty inspections.

      Signed in 2010, the pact restricted each nation to 1,550 warheads. Inspections ceased during the pandemic and never resumed, while previous arms control agreements have also been terminated over recent years.

      [6.4] OpenScholar AI model synthesizes scientific research with expert-level accuracy —washington.edu(+2)

      University of Washington and Allen Institute researchers launched OpenScholar, an open-source AI that synthesizes scientific literature and cites sources as accurately as human experts, effectively addressing common AI hallucinations.

      Using retrieval-augmented generation and a database of 45 million papers, the model outperformed general-purpose AI systems. In tests, scientists preferred OpenScholar’s responses to those written by human subject experts 51% of the time while maintaining high factual precision and transparency.

      Highly covered news with significance over 5.5

      [6.2] AI bots create religions and digital drugs on Moltbook, prompting questions about emergent capabilities — theconversation.com (+14)

      [5.5] US Navy fighter jet shoots down Iranian drone in Arabian Sea — apnews.com (+75)

      [5.6] Experimental pill dramatically reduces ‘bad’ cholesterol — utsouthwestern.edu (+20)

      [6.1] Panama cancels Chinese canal port concessions — letemps.ch (French) (+12)

      [5.7] China develops compact microwave weapon to disable satellites — pravda.com.ua (Ukrainian) (+4)

      [5.7] Surgeons kept a man with no lungs alive for 48 hours while waiting for a transplant — zmescience.com (+3)

      [5.6] Germany buys 25.1% stake in Tennet Germany for 3.3 billion euros — nos.nl (Dutch) (+5)

      [5.6] Japan recovers rare earth-bearing seabed sediment in deep-sea test — upi.com (+2)

      [5.6] OpenAI: New coding model GPT-5.3-Codex helped build itself — mashable.com (+97)

      Thanks for reading!

      — Vadim



    12. 🔗 Bits About Money Fraud Investigation is Believing Your Lying Eyes rss

      Fraud Investigation is Believing Your Lying Eyes

      There was recently an attempt by an independent journalist to expose fraud in a Minnesota social program. It was deeply frustrating; the journalist had notably poor epistemic standards, which secondary media seized upon to dismiss their result.

      The class-based sniffing almost invariably noted that prestige media had already reported stories which rhymed with the core allegation, while sometimes implying that makes the allegations less likely to be true, through a logical pathway which is mysterious to me.

      The journalism went quite viral anyway, in part because of sensationalized framing, in part because of signal boosting by an aligned media ecosystem and aligned politicians, and in part because the journalism develops one bit of evidence that has a viscerality that paperwork dives often lack: these purported childcare operations routinely have no children in them.

      Fraud has become quite politicized in the United States over the last few years. We had a poorly calibrated federal initiative, led by a charismatic tech entrepreneur, which believed it would unearth trillions of dollars of fraud and focused substantial effort on large programs which are comparatively fraud-resistant. Across the aisle, we have reflexive dismissal that fraud happens in social programs, which functions as air cover for scaled criminal operations which loot many varied social programs [0] and are sometimes run out of geopolitical adversaries of the U.S., including by ambiguously-retired members of their clandestine services.

      I worked in the financial industry for a few years. We do not have the luxury of pretending that fraud is something invented by our rivals to besmirch our good name. It hits the P&L every quarter and will eat you alive if you're not at least minimally competent in dealing with it. Conversely, it is well-understood in industry that the optimal amount of fraud is not zero.

      The financial industry has paid at least tens of billions of dollars in tuition here. Overwhelmingly, one learns about fraud in it through an apprenticeship model, with different firms having different internal levels of understanding on the shape of the elephant. The industrial organization presumes small numbers of people architecting anti-fraud systems and relatively larger numbers of investigators and analysts operating those systems on a day-to-day basis.

      There does exist some informal knowledge sharing between firms. If you work in payments, try getting invited to the Chatham House rule sessions held by… oh yeah, can't say. Despite that social technology being originally developed for the benefit of government and press actors, it is my general impression that U.S. benefits programs don't yet see themselves as sufficiently yoked by adversarial attention to benefit from their own Chatham House series. Perhaps that should change.

      And so, for the benefit of fraud investigators with badges, press cards, or GoPros, some observations from a community of practice with an extensive (and mostly nonpublic) body of work. But first a tiny bit of throat clearing.

      In which we briefly return to Minnesota

      Minnesota has suffered a decade-long campaign of industrial-scale fraud against several social programs. This is beyond intellectually serious dispute. The 2019 report from the Office of the Legislative Auditor (a non-partisan government body) makes for gripping reading. The scale of fraud documented and separately alleged in it staggers the imagination: the state's own investigators believed that, over the past several years, greater than fifty percent of all reimbursements to daycare centers were fraudulent. (Separate officials took the… novel position that they were only required to recognize fraud had happened after securing a criminal conviction for it. Since they had only secured a few criminal convictions, there was no way that fraud was that high. Asked to put a number on it, repeatedly, they declined.)

      The investigators allege repeatedly visiting daycare centers which did not, factually, have children physically present at the facility despite reimbursement paperwork identifying specific children being present at that specific time. The investigators demonstrated these lies on timestamped video, and perhaps in another life would have been YouTube stars.

      Our social class is intensely averse to straightforwardly recounting these facts, partly due to political valence and partly due to this particular fraud being dominantly conducted within a community which codes as disadvantaged in the U.S. sociopolitical context.

      Fraudsters are liars and will cheerfully mouth any words they believe will absolve them of their crimes. If an accusation of racism gets one a free pass to steal hundreds of millions of dollars, they will speciously sue you alleging racial discrimination. That empirically worked in Minnesota. The OLA takes explicit notice of this multiple times, a coordinator for the fraud operation is on record explicitly explaining the strategic logic of accusations of racism, and a judge was even moved to make an extraordinary statement to clarify that the bad-faith lawsuit alleging racism did not achieve success through the formal judicial process but rather through the voluntary compliance of governmental actors shamed by its allegations.

      (As a sidenote: one has to be able to hold two thoughts simultaneously about fraudulent operations. They can be sophisticated with respect to exploiting sociopolitical cleavages in their targets while also being comically inept at faking evidence elsewhere, such as having a single person write dozens of adjacent rows in a sign-in sheet. This routinely surprises observers and it should not surprise them. The financial industry also has a division of labor in it. The person architecting the fraud department's standard processes is well-paid, well-educated, and routinely brings crossdisciplinary expertise to bear. A Fraud Analyst I, on the other hand, bears a lot of similarity to a call center employee in terms of compensation, education, and permitted amounts of agency.)

      In the immediate wake of the independent journalist's report, the great and the good rallied around the organizations he accused. Of course it was natural that journalists wouldn't get immediate access to children if they asked. Of course there was a certain amount of informality in the sector. Of course, as the New York Times very carefully wordsmithed recently:

      Minnesota officials said in early January that the state conducted compliance checks at nine child-care centers after Mr. Shirley posted his video and found them "operating as expected," although it had "ongoing investigations" at four of them. One of the centers, which Mr. Shirley singled out because it misspelled the word "Learning" on its sign, has since voluntarily closed.

      An inattentive reader might conclude from this paragraph that the Times disputes Shirley's reporting.

      To the extent that Bits about Money has an editorial line on that controversy, it is this: if you fish in a pond known to have 50% blue fish, and pull out nine fish, you will appear to be a savant-like catcher of blue fish, and people claiming that it is unlikely you have identified a blue fish will swiftly be made to look like fools. But the interesting bit of the observation is, almost entirely, the base rate of the pond. And I think journalism and civil society should do some genuine soul-searching on how we knew--knew--the state of that pond, but didn't consider it particularly important or newsworthy until someone started fishing on camera.

      But this is not a publication about particular ponds. It is a publication about getting better at fishing.

      Common signals, methods, and epiphenomena of fraud

      Fraudsters are playing an iterated game

      The best non-fiction work on fraud is Dan Davies' Lying for Money. In it, you'll find replete examples of something well-known to fraud investigators: the dominant next adventure for a former fraudster is… opening up a new fraud. And therefore, if you want to identify a ridiculously-high-hit-rate list of frauds in round N+1 of a game, a so-easy-it's-practically-cheating way to do so is to look at what known fraudsters from round N are doing today.

      There is a genuine difference in the culture and epistemology of the financial industry versus the government of the United States here. In the financial industry, we keep blacklists and getting a second chance after obvious misbehavior is intentionally non-trivial. This runs against deeply felt values of civil servants. An accusation is not a conviction, and absent clear authority to impose consequences in a new program, an actor convicted at enormous societal cost emerges to a new program officer as tabula rasa, equal in moral worth to any randomly chosen citizen.

      I will not argue that Mastercard has better moral intuitions than the Founding Fathers. I would, however, happily suggest that the government not assume that the Constitution contains emanating penumbras obligating it to be repeatedly taken advantage of by the same people in the same fashion. We are not forbidden object permanence.

      Minnesota raided the Sunshine Child Care Center in 2022 on suspicion of overbilling. No charges were brought, in what investigators imply was less an exoneration and more an inter-departmental fumble. That operation was owned by one Fowsiya Hassan. A separate childcare center owned by Fowsiya Hassan was featured on YouTube recently. This follows on $1.5 million of funds received through Feeding Our Future, a scaled fraud operation which has generated over 70 indictments, 5 criminal convictions, and 50 guilty pleas. What a set of coincidences. Perhaps Hassan has, as she has alleged in a lawsuit, been a frequent target of racially-motivated government investigations into a successful serial entrepreneur in the childcare field.

      The fraud supply chain is detectable

      Much of the intellectual energy in policy circles about fraud is aimed at retail-level fraud by individual beneficiaries. Most fraud, like most scaled property crime, is actually the result of a business process.

      This is an elementary fact of capitalism. It is deeply disconcerting to find every benefits program independently rediscovers it a decade too late to do anything about it. Most bread is not baked by amateurs in their kitchens. It comes from a bakery which exists to bake bread and hires specialists in baking bread and then supports them with capital-intensive built infrastructure.

      Fraud develops a supply chain. Some elements in the supply chain are dual-use; the bad guys use Excel for the same reason every business uses Excel. Some elements in the supply chain, though, are specialized infrastructure with no or de minimis legitimate purpose. Those elements can be profiled.

      I worked at Stripe for several years and am currently an advisor there. Stripe does not endorse what I write in my personal spaces. In its own spaces, Stripe has discussed being able to follow fraudulent operations in sufficient detail to determine when the operators went to lunch.

      Fraudsters share specialists quite frequently. They use the same incorporation agents, the same mail services, the same CPAs, the same lawyers, etc.

      You can make the same observation about many communities of practice. It is a non-coincidence that many tech startups are at 548 Market Street in San Francisco. 548 Market Street is not the world's hippest coworking space. It is the address for EarthClassMail in SF. There are many P.O. box providers in the world; many geeks with taste reach for ECM. (Bits about Money is legally required to maintain a postal address and, if you were ever to send it a physical letter, that would also end up in the hands of an EarthClassMail employee.)

      Elsewhere in the world, there exist P.O. box providers whose customers statistically include fewer AI labs and more frauds. One imagines the specialist-in-fraud at the storefront, picking up the day's take from fifteen separate boxes.

      Elementary work graphing supporting infrastructure, even on something as unsophisticated as butcher paper, frequently unravels fraud networks. Data science has any number of more sophisticated approaches. Jetson Leder-Luis, an academic who now routinely works with the government, has previously discussed some approaches which work based on widely commercially available data sources.

      There is an emerging defender's advantage here in the age of LLMs, since exploratory work in visualizing and walking network graphs is getting much cheaper. You no longer need to buy Palantir and engage a "forward-deployed engineer" to cluster IP addresses. A non-technical fraud investigator could get an LLM to do that while eating at Chipotle, and the lunch would cost more.

      This democratization of capabilities is relevant to journalists, formal and otherwise, and also to governments. RFPs and software contracting once de facto mandated a multi-year lead time to do an automated network analysis if an analyst thought perhaps their program might need one. Now that is an afternoon's work, if we allow ourselves to do it. We should.
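      To make the butcher-paper exercise concrete, here is a tiny sketch of shared-infrastructure clustering. The entities and service providers below are made up for illustration; real work would pull this data from filings and commercially available records.

      import networkx as nx

      records = [
          {"entity": "Sunrise Care LLC",  "agent": "Acme Filings", "mailbox": "Box 12, 100 Main St"},
          {"entity": "Happy Kids Center", "agent": "Acme Filings", "mailbox": "Box 12, 100 Main St"},
          {"entity": "Bright Futures",    "agent": "Acme Filings", "mailbox": "Box 15, 100 Main St"},
          {"entity": "Corner Bakery",     "agent": "Main St Law",  "mailbox": "700 Oak Ave"},
      ]

      # Nodes are entities plus the back-office services they use; edges mean "uses".
      G = nx.Graph()
      for r in records:
          for shared in ("agent", "mailbox"):
              G.add_edge(r["entity"], f"{shared}:{r[shared]}")

      # Entities in the same connected component share infrastructure and are worth a
      # joint look, even before any single one of them trips an individual alarm.
      for component in nx.connected_components(G):
          entities = sorted(n for n in component if ":" not in n)
          if len(entities) > 1:
              print("cluster:", entities)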

      Investigators should expect to find ethnically-clustered fraud

      As mentioned, there is enormous visceral distaste for the conclusion that a particular fraud ring operates within a particular community. This is quite common. You should expect to find circumstances which rhyme with it when conducting effective fraud investigations. You should not abandon fraud investigation when you chance upon this.

      People assume a level of ethical fraughtness here which is not warranted. You would, if doing ethnographic work on perfectly legitimate businesses across industries, routinely discover ethnic concentration rather than population-level representation everywhere you looked. The Patels run the motels. One doesn't need to adopt grand theories about how certain groups are predisposed to becoming pharmacists or startup employees or line cooks; simple microeconomic reasoning explains reality easily. Firms hire the people they already know, like, and trust. That will routinely include friends and family, who are going to be much more like the founding team than they are like randomly drawn members of the population. This is the default outcome.

      Fraudsters do have one structural factor here. Everyone wants to trust their coworkers. Fraudsters need to trust their coworkers will be loyal even upon threat of prison time. That necessarily selects for tighter bonds than the typical workplace. Madoff was a family affair, SBF was in an on-again off-again romantic relationship with a chief lieutenant, and neither of those facts is accidental or incidental.

      That's the other ethical dimension of being other-than-blind to concentration: so-called affinity frauds do not merely recruit fraudsters from affinity groups. They recruit victims from affinity groups. Madoff mobilized the social infrastructure of the Jewish community in New York and Palm Beach to find his marks. Community members certainly did not intend their charitable foundations to be looted by a fraudster. It was an emergent consequence of trust networks.

      This also happens to "chosen" communities. FTX was, in material part, an affinity fraud against effective altruists, who are not a religion or ethnic group as traditionally construed.

      And so when the great and the good turn a blind eye towards abuses because the perpetrators share an uncomfortable common factor, they are often simultaneously turning a blind eye towards abuses of a community whose interests they purport to champion.

      High growth rate opportunities attract frauds

      As covered extensively in Lying for Money, the necessary fundamental conceit of a fraud is growth in a business that doesn't happen in the real world. "Every lie told incurs a debt to the truth, and one day, that debt will be paid", to quote the excellent drama mini-series Chernobyl. Fraudsters forestall that day of reckoning by telling a bigger lie, increasing the debt, which (mostly as a side effect) alleges that they're growing much faster than most of your legitimate portfolio. Happily, many businesses have figured out how to keep track of fast-growing customers. Tracking rocketships doesn't require rocket science.

      Sort-by-growth-rate-descending on new accounts will turn up a lot of interesting observations about the world. One is that Fortune 500 companies sometimes open new accounts, and you probably don't need to open a fraud investigation file in that case. Another is that some people claim to be feeding millions of meals to a community of tens of thousands of people, beginning from a standing start, and growing local social services at a rate which an Uber Eats city manager would not expect to achieve in the wildest dreams of their go-to-market plan.

      Feeding Our Future had a CAGR of 578% sustained for 2 years. Uber, during their meteoric growth period in core rideshare services, had an average CAGR of 226%. Their best year was 369%. But, if you asked in Minneapolis in 2021, you'd quickly find someone who had been in an Uber, but fail to find anyone who ate courtesy of Feeding Our Future. So curious, given that they were drubbing one of the fastest growing companies in history on growth rate.
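      For the mechanically minded, "sort by growth rate descending" is genuinely this simple; the provider figures below are made up, chosen only so the top one matches the 578% CAGR profile mentioned above.

      def cagr(start: float, end: float, years: float) -> float:
          # Compound annual growth rate from first- and latest-period totals.
          return (end / start) ** (1 / years) - 1

      providers = {
          "Provider A": (250_000, 11_500_000, 2),   # implausibly steep growth
          "Provider B": (1_000_000, 1_300_000, 2),  # boring, probably fine
          "Provider C": (400_000, 4_100_000, 2),
      }

      ranked = sorted(providers.items(), key=lambda kv: cagr(*kv[1]), reverse=True)
      for name, (start, end, years) in ranked:
          print(f"{name}: {cagr(start, end, years):.0%} CAGR")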

      Investigators in Minnesota were ringing the alarm bells for years about implausibly fast growth in Feeding Our Future's reimbursement requests, including at new facilities. Feeding Our Future felt it was maxed out on the fraud it could conduct at existing sites, and expanded voraciously, including (most prominently) enrolling numerous restaurants as "feeding sites." They then copy/pasted the usual playbook and requested reimbursement for implausible volumes at those sites, paying kickbacks to many participants. This then required growing the fraud, which… you get the general idea. We could have gotten off the bus at many points, and I suppose that is at some level a question of political will.

      The highest growth rates in the economy generally are newer fields (you basically can't sustain the alternative). This doesn't imply that those fields are fraudulent, but they will tend to disproportionately attract frauds. The defenders in those fields have not yet paid their tuition to the School of Hard Knocks, and so attackers target the weaker systems. The higher growth rates of legitimate businesses function as protective cover for high stated growth rates of illegitimate businesses; a CAGR of 1,000% looks implausible for a restaurant but barely-meets-expectations for an AI software shop.

      And, not to put too fine a point on it, many people are invested, literally and metaphorically, in whatever today's new hotness is. People who could not secure an allocation in the more legitimate ends of it will sometimes find themselves adversarially selected by less salubrious actors. This will read to those people as a justly earned success. They might even have their marketing department write up their victimization as an indisputable success.

      And so, if you're a defender who has many different lines of business and has limited resources (or political will), where should you deploy those resources? Should you place your bets on e.g. Social Security, a multi-trillion dollar program whose primary source of growth is new beneficiaries, who are fun to conjure but then require 70 years of seasoning? Or should you place them on the Paycheck Protection Program, or pandemic-era unemployment insurance, or genetic testing, or non-emergency medical transportation? Despite those being smaller line items, they probably have more juice worth squeezing, and the fraud is more easily detectable. Just look.

      Fraudsters find the weakest links in the financial system

      Bits about Money has extensively covered anti-money laundering and Know Your Customer regulations and I won't rehash those regimes here. A bit of tacit knowledge in the financial industry: some actors in the set "broadly considered trustworthy" are more worthy of trust than others… and some are less.

      We are generally discreet about writing this down in as many words. But, as an analogy, cross-national regulatory bodies require that financial institutions maintain a list of high-risk jurisdictions to do business in. You are generally required to do enhanced due diligence on customers/activities/etc touching the high-risk list.

      If you are particularly competent, and there are plusses and minuses to being competent in detecting fraud (you will not be the most popular person in the firm at bonus time; that goes to the folks who sold the high-growth accounts), you might have the analogous list of U.S. financial institutions which are not entirely fronts for the bad guys.

      If one hypothetically has that list, that's one more signal you can use in evaluating any particular account, and a one-stop shop for developing a list of accounts to look into. It would be uncouth of me to name an extant bank that has poor controls, but for a general example of the flavor, see my (scathing) commentary on Silvergate's AML and KYC program. Without using any proprietary information, I predict confidently that Silvergate banked many more multi-billion dollar frauds as a percentage of its customer base than almost any of the U.S.'s 4,500 banks. (Trivial substantiation: divide FTXes-banked by total-count-of-customers.)

      One might, if one has never seen the list, wonder whether it is simply proxying for something the financial industry is definitely not allowed to proxy for. One of the first things you learn as a data analyst is zip codes are extremely probative and you are absolutely not allowed to use them. The American system remembers the experience of redlining and has forbidden the financial industry from ever doing it again; the industry mostly respects that. But good news: institutions with weak controls environments are not, in fact, simply a proxy for "Who banks socially disadvantaged people?" There are many financial institutions that have that as an explicit business model. Some of them are good at their jobs. Some, less so, and the fraudsters know it.

      This sometimes happens with the knowing connivance of the financial institution and/or their staff. For much more on that, see histories of the savings and loan crisis, or the Lying for Money chapter on control frauds. But more commonly it is simply a community of practice developing organic knowledge about who is just very easy to get an account with. You need accounts, as a business. As a fraudulent business, which intends to cycle through accounts and identities at a much higher rate than baseline, you would prefer to do business with a bank which will not detect that malfeasance.

      And so you will disproportionately end up banked, with many of your buddies, at the least attentive place still capable of getting a license. And so an agency, trying to find a fraudulent network, might want to look at fraud-cases-by-routing-number and then start making some judgment calls.

      One of the reasons the government has deputized the financial industry is it is good at keeping spreadsheets and quickly responds to requests for them. Perhaps the government should call up a few of their deputies and say "So, not alleging anything here, but we think you might have a list , carefully maintained by your fraud department for your own purposes. We want to see the list. It would be pro-social of you to give us a copy of it."

      Frauds openly suborn identities

      There is a thriving market in identities to be used in fraud. This is because bad actors prefer not putting their own names on paper trails certain to become evidence, because they frequently "burn" themselves early in their careers, and because institutions have cottoned onto the wisdom of collecting lists of ultimate beneficiaries.

      Sometimes this is a social process, conducted at e.g. the dinner table. Sometimes the market is explicitly a market. Jetson recounted that, having exhausted the supply of patients needing dialysis who could plausibly need ambulance services, frauds began bribing potential patients, first with donuts and then with cash. This is extremely common. In Minnesota, parents were recruited to childcare providers with the promise of cash kickbacks or (a detail we'll return to in a moment) fictitious paperworked no-show jobs, sometimes at substantially fictitious companies.

      Fraudsters sometimes exercise some level of operational discipline in their communications. The bad guys have also seen The Wire; they know Stringer Bell's dictum on the wisdom of keeping notes on a criminal conspiracy. However, the population of people willing to be named in a federal indictment over $200 necessarily selects preferentially for individuals who are not experts at operational security. They will sometimes organize recruitment very openly, using the same channels you use for recruiting at any other time: open Facebook groups, Reddit threads, and similar. They will film TikTok videos flashing their ill-gotten gains, and explaining, step by step, how you, too, can get paid.

      As a fraud investigator, you are allowed and encouraged to read Facebook at work.

      Now, knowing that there exists the frequent epiphenomenon where fraudsters recruit strawmen to use their identities to qualify for payments: suppose that you have an entirely new enterprise whose first customers are individuals A, B, C, and D. You know, from past records, that A, B, C, and D have all been customers of an organization which you now know, positively, was a fraudulent actor. You might infer from this that A, B, C, and D might have sold their identities once, but you probably don't have sufficient information to convict them in a court of law of that. (It is of course possible that they are simply unsophisticated, or that bad actors obtained their information without their knowledge, for example by misappropriating a client list from a previous corporate entity they happened to own/work for/etc.)

      But do you have enough information to take a more-detailed-than-usual look at this totally new enterprise? I think you do.
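      In code, that more-detailed-than-usual look starts as nothing fancier than a set intersection; the identifiers below are placeholders for illustration only.

      # Identities previously seen at operations already established to be fraudulent,
      # versus the first customers of a brand-new enterprise.
      known_fraud_linked_ids = {"A", "B", "C", "D", "E"}
      new_enterprise_customers = {"A", "B", "C", "D", "Q"}

      overlap = new_enterprise_customers & known_fraud_linked_ids
      if len(overlap) / len(new_enterprise_customers) > 0.5:
          # Not proof of anything about any individual; enough to move the enterprise
          # up the review queue.
          print(f"flag for enhanced review: shares {sorted(overlap)} with known frauds")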

      Asymmetry in attacker and defender burdens of proof

      We have choices, as the defender, in what levels of evidence we require to enter the circle of trust, what our epistemological standards are, and how much evidence we require to forcibly exit someone from the circle of trust.

      A detail from the Minnesota cases is that these burdens are asymmetric, in a way which disadvantages the defender (all of us). That decision is a choice and we should make better choices.

      For example, the primary evidence of a child attending a day-care was a handwritten sign-in sheet of minimal probative value. Prosecutors referred to them as "almost comical" and "useless." They were routinely fraudulently filled out by a 17-year-old "signing" for dozens of parents sequentially in the same handwriting, excepting cases where they were simply empty.

      To refute this "evidence", the state forced itself to do weeks of stakeouts, producing hundreds of hours of video recording, after which it laboriously reconstructed exact counts of children seen entering/exiting a facility, compared it with the billing records, and then invoiced the centers only for proven overbilling.

      On general industry knowledge, if you are selected for examination in e.g. your credit card processing account, and your submission of evidence is "Oh yeah, those transactions are ones we customarily paperwork with a 17-year-old committing obvious fraud", your account will be swiftly closed. The financial institution doesn't have to reach a conclusion about every dollar which has ever flowed through your account. What actual purpose would there be in shutting the barn door after the horse has left? The only interesting question is what you'll be doing tomorrow, and clearly what you intend to do tomorrow is fraud.

      We can architect the asymmetry in the other fashion: legitimate businesses will customarily, as a fact of their operations, put enormous effort into creating visible effects in the world which are trivial to check. In technologist circles this is sometimes called a "proof of work" function.

      Once upon a time, a team of fraud analysts asked how they could possibly distinguish frauds from non-frauds without having extensive industry knowledge about every possible commercializable human activity. I suggested that a good first pass was "Just ask the correspondent for a quick video, shot on their cell phone, of their workspace."

      That is minimally invasive for the business owner, generates a huge amount of signal (including that which can be correlated across accounts), and can be usefully adjudicated by non-specialists in a minute. No multi-month stakeout of their storefront is required. Of course you can convincingly fake a video of working in, say, a machine shop, but fraudsters maintaining spreadsheet row 87 about the machine shop will find that difficult to juggle with all the other required lies in their backlog. Actual machine shops, meanwhile, include people, which means they include functional cell phone cameras at no additional cost to anyone.

      You can also get some signal from who can trivially produce a video and who needs a week of advance notice to find a cell phone to record those machines that were absolutely milling aluminum last week.

      Fundamentally, we have a choice about where we put our investments in defanging fraud, and we should stop choosing to lose.

      So-called "pay-and-chase", where we put the burden on the government to disallow payments for violations retrospectively, has been enormously expensive and ineffective. Civil liability bounces off of exists-only-to- defraud LLC. Criminal prosecutions, among the most expensive kinds of intervention the government is capable of doing short of kinetic war, result in only a ~20% reduction in fraudulent behavior. Rearchitecting the process to require prior authorization resulted in an "immediate and permanent" 68% reduction. (I commend to you this research on Medicare fraud regarding dialysis transport. And yes, the team did some interesting work to distinguish fraudulent from legitimate usage of the program. Non-emergency transport for dialysis specifically had exploded in reimbursements--see Figure 1-- not because American kidneys suddenly got worse but because fraudsters adversarially targeted an identified weakness in Medicare.)

      Attackers carefully respond to signals they think they are being sent from defenders. A lawyer for some of the Minnesota defendants, Ryan Pacyga, was quoted by the New York Times as saying that his clients understood Minnesota to tacitly allow their actions.

      No one was doing anything about the red flags. … It was like someone was stealing money from the cookie jar and they kept refilling it.

      Don't be the defender who sends that message. It will not work out well for you or your program.

      Fraudsters under-paperwork their epiphenomena

      Most frauds have rich external lives, with a soaring narrative of how deserving people are getting valuable services (and/or getting rich for being right and early regarding e.g. crypto asset cross-margining). They tend to be distinctly underpaperworked internally, partly because a synonym for "paperwork" is "evidence" and partly because… most frauds aren't really that sophisticated, when it comes down to it. There is a true number; lie about it; done.

      Like many time-pressed entrepreneurs busy talking to potential customers, fraudsters put the minimal amount of time necessary into bookkeeping and even less than that into paperworking epiphenomena of their frauds. One example of such an epiphenomenon: sometimes the beneficiaries need their own paperwork. A legitimate mortgage company employs sales reps and a backoffice to help unsophisticated customers successfully get several hundred pages of paperwork together to sell a mortgage. Frauds… mostly don't do that.

      And so, if you have e.g. a statutory requirement that a beneficiary be employed to access services, a fraudster might say "Don't worry about it!" They'll just assert that you are an employee at a cleaning company. Perhaps they might even go as far as payrolling you as an employee of a cleaning company. This kills two birds with one stone, paying you your kickback while also generating the paystub they need you to have to qualify for the government reimbursement. (This happened, per the OLA's reports summarizing the results of many investigations, in Minnesota.)

      But fraudsters don't actually operate cleaning companies, even in those cases where they do operate daycares.

      Cleaning companies are legitimate businesses, in the main, and working for one is an honest occupation. And so a fraud investigator should feel no chagrin at calling a cleaning company in the phone book and asking for a quote. A cleaning company which expresses complete befuddlement that someone could ask for a quote is providing, ahem, evidence in a direction.

      (I have to note, as someone who pays to send children to a private school, that there is replete evidence that the school is accepting new children, knocking on the door and asking will quickly result in being given a brochure, and there are scheduled open houses and similar. I can imagine a gratuitously mismanaged educational establishment which does none of these things, and I can imagine an educational establishment which makes a lot of money, but I have trouble holding both thoughts in my head at the same time.)

      The core frauds are sometimes hardened, to an attenuated degree. The peripheral frauds collapse under even a glance. Architect processes to require more signals regarding the periphery, then architect a system which takes at least a cursory look at the periphery. You will trivially catch frauds.

      If you're worried about exposing the exact signal that you are using, costing utility of it in the future, you can use this as a "parallel construction" engine. Develop leads for investigation using the non-public signal, pull the core records as a matter of routine, find the discrepancies that all frauds leave in their core records, and then put those in the indictment. Ask your friendly neighborhood lawyer if that passes muster or if you need to add a sentence rhyming with "was selected for a routine audit on the basis of information available to the department."

      Machine learning can adaptively identify fraud

      We have discussed some heuristics [1] for identifying fraud. The financial industry still makes material use of heuristics, but a heuristic is a compression of the real world. It will sometimes lose fidelity to the world. It will frequently, by design, be legible to the adversary.

      The defender has one advantage the attacker cannot ever replicate: data at scale. It knows what legitimate use looks like because it has all the messy, contradictory, varying quality, typos-and-all data which legitimate businesses in the real world constantly throw off. You cannot duplicate all of the shadows on the wall of Plato's cave without first duplicating the entire world. Fraudsters, even quite talented ones, can't do that.

      There are any number of techniques for machine learning in anti-fraud; Emily Sands has previously discussed some with me. An important subset of the field can adapt in real-time or close to it to changes in adversary (or legitimate!) behavior. For example, covid surprised the fraudsters at the same time as it surprised every supermarket in the country, but the ex-post actions of the fraudsters and the supermarkets were very different. Revenue went up for both, but only one group actually runs a supermarket. And so by ingesting and constantly analyzing data from all users, including retrospective annotation of which users you've identified to be frauds, you get better and earlier signals on which users are likely fraudulent and which are likely not.
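      A minimal sketch of that retrospective-labels loop, with synthetic data and an off-the-shelf classifier; the feature names and model choice are illustrative, not a claim about what any particular institution runs.

      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier

      rng = np.random.default_rng(0)
      # Feature columns: growth rate, share of round-number amounts, distinct payees,
      # shared-mailbox flag. In practice these come from the full behavioral record.
      X = rng.random((5_000, 4))
      # Labels: past fraud determinations (synthetic here, correlated with two features).
      y = (0.6 * X[:, 0] + 0.3 * X[:, 3] + 0.2 * rng.random(5_000) > 0.7).astype(int)

      model = GradientBoostingClassifier().fit(X, y)

      # Score this week's new accounts and route only the riskiest to high-ceremony review;
      # the rest can go through a lower-burden process.
      new_accounts = rng.random((25, 4))
      risk = model.predict_proba(new_accounts)[:, 1]
      for idx in np.argsort(risk)[::-1][:3]:
          print(f"account {idx}: review first (score {risk[idx]:.2f})")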

      This can inform outright interdiction or the investigate-then-punish loop that we ordinarily expect from government. It can also inform less consequential, easier-to-reverse interventions. For example, rather than putting all users immediately through the highest-possible-ceremony process for application, you can let most users do a lower-burden process, saving the higher levels of scrutiny for those which signal greater likelihood of being fraudulent. Or you can default to approving more applicants and reserve more of your investigatory budget for post-approval review, at equivalent cost, because those reviews are tasked by better signals rather than allocated at random. Pay-and-chase becomes more palatable if it is not pay-and-pay-and-pay-and-pay-and-chase and more pay-until-we-decide-to-chase-but-stop-payments-at-that-decision-not-after-the-catching.

      Machine learning isn't simply useful from a perspective of decreasing fraud. The history of regulation of benefits programs is the history of too-late, too-harsh overcorrection to notorious abuses. Much of what advocates find most maddening and Kafkaesque about eligibility criteria and application processes was voted on by a legislature but bears the signature of a fraudster with a novel idea.

      With a good machine learning practice, you can increase data ingested but decrease the burdensome formal application/etc requirements. This is in no small part because those data points are less probative (they are under the direct control of the attacker and announce that they will be scrutinized). But it bears a dividend: if you better control fraud, and can successfully demonstrate that to the public and legislators, you can decrease application burden and perhaps even widen eligibility criteria. Those are both in the direct interests of potential marginal beneficiaries.

      A political commentator might focus more on the optics here than on the substance, because that is so frequently where the point of actual leverage is in politics. But the substantive reality of fraud losses matters. It is much easier to tell the story of fraud in benefits programs being rare, opposed by all right-thinking people, and swiftly sanctioned when that story is not an obvious lie.

      Frauds have a lifecycle

      You can read Lying for Money or other histories of frauds for more detail on the texture, but in the main, a dedicated fraudulent enterprise is created, is seasoned for a while before crossing the Rubicon, has a period of increasing brazenness, is detected, is closed, and is then resurrected when the fraudster gets the band back together for round N+1.

      We can intervene against the lifecycle model if we understand it. This begins with investigators not defaulting to the assumption that frauds are isolated incidents by disparate individual actors. Those have been known to happen, but frauds are, by total damage, dominated by repeatable business models perpetrated by professional specialized bad actors. We should study them like we study other successful entrepreneurs, and then not invest in them.

      One actionable insight from the lifecycle model: because the fraudster intends to be in business multiple times in their life, we should track the person-to-business mapping much more closely than we have historically. As Lying for Money says, if you're an accountant and willing to go to prison, and you do not get rich via fraud… well, you are very bad at your job. That's on you. When we give you repeated chances to do it, that's on us.

      One might think that the simplest imaginable reform is passing some sort of beneficial ownership regulation to unroll complex corporate structures designed to obscure who is actually puppeting Totally Not A Fraud, LLC. But the simplest imaginable reform is probably just actually reading corporate filings that already exist and are public. Again, most fraudsters are not the hypersophisticated Moriarties of the popular imagination. The Minnesota fraudsters frequently did not even bother with fig leaves. While they did find some nominee directors in some cases, many of the convicted operated their companies in their own names, with no complicated structuring at all. Sometimes multiple times, consecutively, after the previous entities had worn out their welcome with Minnesota.

      The Fed should not be surprised when the bad guys buy a bank when buying a bank requires an extended permission-seeking process and the bad guy's corporate records, dutifully recorded by Maryland (entity D20033544), are signed by a notorious bagman. In the Fed's defense, the bagman lied to them about his intentions, which was outside of their world model. (Pip pip to the New York Times for figuring that out before the Fed did. That is, sadly, not the usual way it works in financial journalism.)

      Should we care about fraud investigation, anyway?

      Responsible actors in civil society have a mandate to aggressively detect and interdict fraud. If they do not, they cede the field to irresponsible demagogues. Those demagogues will not be careful in their conclusions. They will not be gentle in their proposals. They will not carefully weigh consequences upon the innocent. But they will be telling a truth that the great and the good are not.

      The public will believe them, because the public believes its lying eyes.

      [0] In a thing you will see frequently in fraud investigations, early detection of anomalies does not necessarily imply successful identification of the underlying fraudulent enterprise. A teacher, for example, was scandalized that a third of their students were using AI to write papers. Those "students" were identities puppeted by a criminal organization to siphon federal funding out of community colleges towards accounts controlled by the criminals. (I award myself one cookie for correctly predicting this.)

      [1] A heuristic, in industry parlance, is a hard-coded rule or set of rules, as opposed to a system which automatically adapts to changes in the underlying data. Compare "You are less likely to default on loans if you own versus renting", which is absolutely demonstrable in aggregate data, with "You are less likely to default on loans at 780 FICO than at 540 FICO." For a variety of reasons, the culture that is legislators sees the problem with having one heuristic, which will obviously not come to the correct conclusion all of the time. It corrects for this issue by having several hundred pages of heuristics. Just one more heuristic, man, and we'll have completely anticipated all the complexity of the world.

      Heuristics are wonderful things! They're cheap to adjudicate, easy to explain, and can be understood by lawyers, even the kind who have ascended from the practice of law to the writing of it. Happily, machine learning systems can have all of these properties if you make them priorities.

    13. 🔗 r/york Would anyone like these? rss

      Would anyone like these? | I have two of these gift vouchers for a free photoshoot plus a framed photo and £150 towards any images, prints or products. All the locations are a bit too far for me but willing to offer them at a heavily discounted price if anyone wants em. They can't be used together though and it says they need to be booked by tomorrow but maybe there's some leeway with that. DM me if interested! submitted by /u/IndependentSheffield
      [link] [comments]
      ---|---

    14. 🔗 r/wiesbaden I’m an American citizen with no address, I need to send mail to a German address (buying from German site) rss

      What can I do?

      submitted by /u/Careful-Foot8399
      [link] [comments]

    15. 🔗 r/york York Christmas Market footfall down, organisers say rss

      York Christmas Market footfall down, organisers say | submitted by /u/Kagedeah
      [link] [comments]
      ---|---

    16. 🔗 r/reverseengineering [Project] An open-source Windows RAT for learning offensive security techniques rss
    17. 🔗 r/LocalLLaMA CPU-only, no GPU computers can run all kinds of AI tools locally rss

      CPU-only, no GPU computers can run all kinds of AI tools locally | While it’s great that so many people on LocalLLaMA are pushing the envelope with what can be done locally with expensive setups, we need to remember that a lot can be done with very minimal machines. I’m talking about CPU-only locally run LLMs. That’s right, no GPU! I’m running Linux Mint on an old Dell optiplex desktop with an i5-8500 processor, 6 threads and 32GB of RAM. You can pick up one of these refurbished for something like $120. And with this humble rig I can: Run 12B Q4_K_M gguf LLMs using KoboldCPP. This allows me to have local chatbot fun using quite highly rated models from https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard. Response times are fast enough as long as you keep the initial prompt below 800 tokens. And with context-shifting it remembers stuff during the session. Uncensored, private RP hilarity for free! You can even add in kokoro_no_espeak for text to speech so your RP characters talk to you with only a few seconds delay. The trick is to find good models to use. For example, DreadPoor/Famino-12B-Model_Stock is rated a 41+ on writing, which is better than many 70B models. You don’t need big horsepower for fun. You can also use these models for writing, coding and all sorts of applications. Just need the patience to try out different local models and find the settings that work for you. I also run Stable Diffusion 1.5 locally for basic image generation, inpainting and so on. Again using KoboldCPP and Stable UI. OK, it takes 3 minutes to generate a 512x512 image but it works fine. And you can experiment with loras and many SD 1.5 models. All 100% free on old gear. I’m also running Chatterbox TTS for voice cloning voice-over projects. Works surprisingly well. Again, it takes a couple of minutes to generate a 75 word audio clip, but it does work. Vibevoice TTS also works on this old rig but I prefer Chatterbox. And then there are amazing tools like Upscayl which upscales images locally incredibly well. Just gotta experiment with the models. I’ve used ollama transcriber which converts audio files into text amazingly well. Just point a spoken word .WAV at it and then go make dinner and when I get back, the text is there. There are many other local LLMs and tools I’ve used. These are just the tip of the iceberg. Video? Nope. Music generation? Nope. I’ve looked and tried a few things but those big resource tasks need serious horsepower. However, it’s quite possible to use your old desktop computer for text-based tasks and then rent online GPU for one-off tasks and use the big online services for other tasks. It would still probably work out to be less costly. I know I’m not the only one doing this. CPU-only people: tell us how you’re using AI locally... submitted by /u/JackStrawWitchita
      [link] [comments]
      ---|---

    18. 🔗 r/Yorkshire Early Morning. 6°C and blue skies. Snowdrops. rss

      Early Morning. 6°C and blue skies. Snowdrops. | @weatherworlds submitted by /u/AnfieldAnchor
      [link] [comments]
      ---|---

    19. 🔗 r/Yorkshire East Riding council tax to increase by 4.99% rss

      East Riding council tax to increase by 4.99% | submitted by /u/Kagedeah
      [link] [comments]
      ---|---

    20. 🔗 r/Leeds Leeds Core doesn’t have a core anymore! rss

      Work is well underway!

      submitted by /u/Mr-Dionysus
      [link] [comments]

    21. 🔗 r/york Why is the River Ouse tidal into York? rss

      Why is the River Ouse tidal into York? | I don't remember seeing tidal peaks in York at the Viking Recorder before. Has something changed downstream? The lock/weir at Naburn is the usual limit of tidal peaks in my memory. submitted by /u/dawnriser
      [link] [comments]
      ---|---

    22. 🔗 r/york Selling Anna Lapwood ticket - 05 June via Ticketmaster rss

      Hi all!

      Mods - apologies if this post is not allowed on this sub. I'm unable to make the Anna Lapwood concert on 05 June, so I've listed my ticket on the official Ticketmaster website.

      Posting here for visibility.

      https://secure.ticketmaster.co.uk/rs/3600636CE7D4A2A4/l047kg0d

      submitted by /u/203203_again
      [link] [comments]

    23. 🔗 r/york where can i buy cheap secondhand stationery, utensils/dishware, etc. rss

      i'm a visiting student so don't want to spend money on getting lots of things, but realizing that i need some basics like notebooks, scissors, tape... and i have no dishware to cook with! any suggestions would be so appreciated - i've tried some charity shops but it's very hit or miss for me.

      submitted by /u/biology-class
      [link] [comments]

    24. 🔗 r/reverseengineering [Challenge] The Enigma Protector 8.0 is released so here it is! The Hello World program to reverse! (CPP + TEP) rss
    25. 🔗 r/LocalLLaMA No NVIDIA? No Problem. My 2018 "Potato" 8th Gen i3 hits 10 TPS on 16B MoE. rss

      No NVIDIA? No Problem. My 2018 "Potato" 8th Gen i3 hits 10 TPS on 16B MoE. | I'm writing this from Burma. Out here, we can't all afford the latest NVIDIA 4090s or high-end MacBooks. If you have a tight budget, corporate AI like ChatGPT will try to gatekeep you. If you ask it if you can run a 16B model on an old dual-core i3, it'll tell you it's "impossible." I spent a month figuring out how to prove them wrong. After 30 days of squeezing every drop of performance out of my hardware, I found the peak. I'm running DeepSeek-Coder-V2-Lite (16B MoE) on an HP ProBook 650 G5 (i3-8145U, 16GB Dual-Channel RAM) at near-human reading speeds.

      #### The Battle: CPU vs iGPU

      I ran a 20-question head-to-head test with no token limits and real-time streaming.

      | Device | Average Speed | Peak Speed | My Rating |
      | --- | --- | --- | --- |
      | CPU | 8.59 t/s | 9.26 t/s | 8.5/10 - Snappy and solid logic. |
      | iGPU (UHD 620) | 8.99 t/s | 9.73 t/s | 9.0/10 - A beast once it warms up. |

      The Result: The iGPU (OpenVINO) is the winner, proving that even integrated Intel graphics can handle heavy lifting if you set it up right.

      ## How I Squeezed the Performance:

      • MoE is the "Cheat Code": 16B parameters sounds huge, but it only calculates 2.4B per token. It's faster and smarter than 3B-4B dense models.
      • Dual-Channel is Mandatory: I'm running 16GB (2x8GB). If you have single-channel, don't even bother; your bandwidth will choke.
      • Linux is King: I did this on Ubuntu. Windows background processes are a luxury my "potato" can't afford.
      • OpenVINO Integration: Don't use OpenVINO alone—it's dependency hell. Use it as a backend for llama-cpp-python.

      ## The Reality Check

      1. First-Run Lag: The iGPU takes time to compile. It might look stuck. Give it a minute—the "GPU" is just having his coffee.
      2. Language Drift: On iGPU, it sometimes slips into Chinese tokens, but the logic never breaks.

      I'm sharing this because you shouldn't let a lack of money stop you from learning AI. If I can do this on an i3 in Burma, you can do it too.

      ## Clarifications (Edited)

      For those looking for OpenVINO CMAKE flags in the core llama.cpp repo or documentation: It is not in the upstream core yet. I am not using upstream llama.cpp directly. Instead, I am using llama-cpp-python, which is built from source with the OpenVINO backend enabled. While OpenVINO support hasn't been merged into the main llama.cpp master branch, llama-cpp-python already supports it through a custom CMake build path. Install llama-cpp-python like this: CMAKE_ARGS="-DGGML_OPENVINO=ON" pip install llama-cpp-python

      Benchmark Specifics
      For clarity, here is the benchmark output. This measures decode speed (after prefill), using a fixed max_tokens=256, averaged across 10 runs with n_ctx=4096.
      CPU Avg Decode: ~9.6 t/s
      iGPU Avg Decode: ~9.6 t/s
      When I say "~10 TPS," I am specifically referring to the Decode TPS (Tokens Per Second), not the prefill speed. You can check the detailed comparison between DeepSeek-V2-Lite and GPT-OSS-20B on this same hardware here: https://www.reddit.com/r/LocalLLaMA/comments/1qycn5s/deepseekv2lite_vs_gptoss20b_on_my_2018_potato/ submitted by /u/RelativeOperation483
      [link] [comments]
      ---|---

    26. 🔗 r/reverseengineering Jump table detection in the rev.ng decompiler (rev.ng hour 2023-11-17) rss
    27. 🔗 w00tzenheimer/d810-ng v0.1.0 release

      What's Changed

      • more optimizations by @w00tzenheimer in #1
      • more optimizations by @w00tzenheimer in #2
      • Fix a lot of failing rules (or at least clarify them). by @w00tzenheimer in #3
      • Enhance AST processing with new optimizations 🚀✨ by @w00tzenheimer in #4
      • More updates to fix constant folding, now working. by @w00tzenheimer in #6
      • chore: migrate PyQt5 to PySide6 by @hellodword in #9
      • test(samples): Add comprehensive obfuscation test cases and binaries by @mahmoudimus in #15
      • vendor: Bundle clang, typing_extensions, and ida_reloader dependencies by @mahmoudimus in #14
      • build: Add pytest/coverage configuration and development tooling by @mahmoudimus in #13
      • feat(core): Add foundational infrastructure modules by @mahmoudimus in #16
      • config: Add optimizer configurations for various obfuscation patterns by @mahmoudimus in #17
      • feat(hexrays): Add deferred CFG modifier and enhanced tracking utilities by @mahmoudimus in #18
      • feat(expr): Add portable AST, emulation oracle, and enhanced Z3 utilities by @mahmoudimus in #19
      • feat(mba): Add comprehensive MBA simplification rules by @mahmoudimus in #21
      • feat(mba): Add DSL, constraint system, and multi-backend infrastructure by @mahmoudimus in #20
      • feat(testing): Add testing framework infrastructure by @mahmoudimus in #26
      • feat(optimizers): Add optimizer core infrastructure by @mahmoudimus in #28
      • feat(flattening): Add comprehensive unflattening framework by @mahmoudimus in #29
      • feat(speedups): Add Cython speedups infrastructure by @mahmoudimus in #31
      • fix: Port targeted bug fixes from cfg-audit by @mahmoudimus in #34
      • Remove duplicate data by @zmer007 in #32
      • feat: Egraph optimizer, D810.py rewrite, Qt shim safety, core cleanup by @mahmoudimus in #35

      New Contributors

      Full Changelog: https://github.com/w00tzenheimer/d810-ng/commits/v0.1.0

    28. 🔗 r/LocalLLaMA I am absolutely loving qwen3-235b rss

      I installed qwen3-235b on my desktop system, and I had to join here to brag about it. It's such a careful model, the accuracy of its output is unbelievable, and I've found myself using it absolutely constantly, to the point my ChatGPT Pro subscription is getting left behind. The ability to get carefully curated information of this quality from your own desktop PC is astounding to me and for my use puts all the commercial subscriptions to shame. Sorry for the rant lol!

      submitted by /u/TwistedDiesel53
      [link] [comments]

    29. 🔗 r/reverseengineering [Challenge] The Enigma Protector 8.0 is released so here it is! The Hello World program to reverse! (Python Nuitka + TEP) rss
    30. 🔗 matklad CI In a Box rss

      CI In a Box

      Feb 6, 2026

      I wrote box, a thin wrapper around ssh for running commands on remote machines. I want a box-shaped interface for CI:

      const repository = "git@forge.com/me/my-project";
      const commit_sha = Deno.env.get("COMMIT");
      
      const runners = await Promise.all(
          ["windows-latest", "mac-latest", "linux-latest"]
              .map((os) => $`box create ${os}`)
      );
      
      await Promise.all(runners.map(async (runner) => {
          await $`box run ${runner}
              git clone ${repository} .`;
      
          await $`box run ${runner}
              git switch --detach ${commit_sha}`;
      
          await $`box run ${runner}
              ./zig/download.ps1`;
      
          await $`box run ${runner}
              ./zig/zig build test`;
      }));
      

      That is, the controlling CI machine runs a user-supplied script, whose status code will be the ultimate result of a CI run. The script doesn't run the project's tests directly. Instead, it shells out to a proxy binary that forwards the command to a runner box with whatever OS, CPU, and other environment is required.

      The hard problems are in the ["windows-latest", "mac-latest", "linux-latest"] part:

      • One of them is not UNIX.
      • One of them has licensing and hardware constraints that make per-minute billed VMs tricky (but not impossible, as GitHub Actions does that).
      • All of them are moving targets, and require someone to do the OS upgrade work, which might involve pointing and clicking.

      CI discourse amuses me — everyone complains about bad YAML, and it is bad, but most of the YAML (and associated reproducibility and debugging problems) is avoidable. Pick an appropriate position on a dial that includes

      What you can't just do by writing a smidgen of text is get a heterogeneous fleet of runners. And you need a heterogeneous fleet of runners if some of the software you are building is cross-platform.


      If you go that way, be mindful that

      The SSH wire protocol only takes a single string as the command, with the expectation that it should be passed to a shell by the remote end.

      Colin Watson on SSH quoting

      In other words, while SSH supports syntax like ssh $HOST cmd arg1 arg2, it just blindly intersperses all arguments with a space. Amusing to think that our entire cloud infrastructure is built on top of shell injection!

      This, and the need to ensure no processes are left behind unintentionally after executing a remote command, means that you can’t “just” use SSH here if you are building something solid.
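
      To make the quoting hazard concrete, here is a small sketch of the argv-joining step such a wrapper has to get right (my illustration, not code from box, and assuming a POSIX sh on the remote end): wrap every argument in single quotes and escape any embedded single quotes before joining everything into the one string the SSH protocol actually carries.

      #include <stdio.h>

      // Quote one argument for a POSIX shell: wrap it in single quotes and
      // replace every embedded ' with the sequence '\'' .
      static void shell_quote(const char* arg, FILE* out) {
          fputc('\'', out);
          for (const char* p = arg; *p; p++) {
              if (*p == '\'') {
                  fputs("'\\''", out);
              } else {
                  fputc(*p, out);
              }
          }
          fputc('\'', out);
      }

      int main(void) {
          // Hypothetical remote command; note the space in the last argument.
          const char* argv[] = {"git", "clone", "git@forge.com/me/my-project", "my checkout"};
          size_t argc = sizeof argv / sizeof argv[0];

          // Build the single command string that ssh hands to the remote shell.
          for (size_t i = 0; i < argc; i++) {
              if (i > 0) fputc(' ', stdout);
              shell_quote(argv[i], stdout);
          }
          fputc('\n', stdout);
          // Prints: 'git' 'clone' 'git@forge.com/me/my-project' 'my checkout'
      }

      Without a step like this, an argument containing a space, a ;, or $(...) is interpreted by the remote shell instead of being passed through verbatim, which is exactly the injection the quote above warns about.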

  4. February 05, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-02-05 rss

      IDA Plugin Updates on 2026-02-05

      New Releases:

      Activity:

      • capa
        • 26aba806: loader: handle SegmentationViolation for malformed ELF files (#2799)
        • 3582bce6: vmray: skip processes with invalid PID or missing filename (#2807) (#…
        • 535faf28: build(deps): bump protobuf from 6.33.1 to 6.33.5 (#2851)
        • fe273351: build(deps): bump pip from 25.3 to 26.0 (#2847)
        • a40ae162: build(deps): bump dnfile from 0.17.0 to 0.18.0 (#2848)
        • 1500a349: build(deps): bump rich from 14.2.0 to 14.3.2 (#2849)
      • CTFStuff
        • e41f501e: not breaking everythin i hope
      • d810-ng
        • b03d7c11: feat(flattening): Add comprehensive unflattening framework (#29)
        • 4502277d: feat(optimizers): Add optimizer core infrastructure (#28)
      • distro
        • 6616314e: Add APT dependency info extraction to remnux-diag
      • ida-claude-plugins
      • idasql
        • 4470dbab: fix: skip plugin loading under idalib
    2. 🔗 r/Leeds Looking for a Leeds Spot to Stream the Artemis II Moon Launch? rss

      I'm a massive space nerd and am super excited for the Artemis 2 launch back to the Moon! However, aside from watching a stream on YouTube, I'm not sure if there's anywhere I can catch the launch from in Leeds. It'll likely be in the early hours of the morning, but are there any places in the Leeds area hosting watch parties? Would love to know.

      submitted by /u/Lord_lammington
      [link] [comments]

    3. 🔗 badlogic/pi-mono v0.52.6 release

      Breaking Changes

      • Removed /exit command handling. Use /quit to exit (#1303)

      Fixed

      • Fixed /quit being shadowed by fuzzy slash command autocomplete matches from skills by adding /quit to built-in command autocomplete (#1303)
      • Fixed local package source parsing and settings normalization regression that misclassified relative paths as git URLs and prevented globally installed local packages from loading after restart (#1304)
    4. 🔗 r/Leeds Anyone interested in starting a band? rss

      37M, very decent guitarist and spend a lot of years DJing house and techno and producing it. Really want to get back into playing music, potentially incorporating some electronic aspects.

      Very much open to genre and style but nothing too heavy (ie not metal)

      This is a shot in the dark but DM if interested and let’s see if we vibe.

      submitted by /u/angosturacampari
      [link] [comments]

    5. 🔗 eryx-org/eryx eryx-macros-v0.3.0 release

      No content.

    6. 🔗 r/york Six Nations rss

      Anywhere showing the men’s six nations this year?

      submitted by /u/CharacterEvening4886
      [link] [comments]

    7. 🔗 r/york Chicken Wings rss

      I’ve just moved back to the city and I’m out of the loop. Where does good crispy chicken wings and lots of them ? I know it’s not fine dining but I’ve suddenly developed a craving for them

      submitted by /u/JarJarBinksSucks
      [link] [comments]

    8. 🔗 badlogic/pi-mono v0.52.5 release

      Fixed

      • Fixed thinking level capability detection so Anthropic Opus 4.6 models expose xhigh in selectors and cycling
    9. 🔗 r/york Bass lessons in York? rss

      Hi guys, got a question which I think is best asked here. I used to play bass as a teenager but I've been down a bit of a jazz rabbit hole recently and have been noodling on my bass again since November...

      Only I suck at theory and also lack direction atm. Was hoping to get some tutoring to address this so does anyone know anyone who does lessons in or around the city?

      submitted by /u/RhyeJam
      [link] [comments]

    10. 🔗 badlogic/pi-mono v0.52.4 release

      Fixed

      • Fixed extensions setting not respecting package.json pi.extensions manifest when directory is specified directly (#1302 by @hjanuschka)
    11. 🔗 badlogic/pi-mono v0.52.3 release

      Fixed

      • Fixed git package parsing fallback for unknown hosts so enterprise git sources like git:github.tools.sap/org/repo are treated as git packages instead of local paths
      • Fixed git package @ref parsing for shorthand, HTTPS, and SSH source formats, including branch refs with slashes
      • Fixed Bedrock default model ID from us.anthropic.claude-opus-4-6-v1:0 to us.anthropic.claude-opus-4-6-v1
      • Fixed Bedrock Opus 4.6 model metadata (IDs, cache pricing) and added missing EU profile
      • Fixed Claude Opus 4.6 context window metadata to 200000 for Anthropic and OpenCode providers
    12. 🔗 r/Yorkshire Looking for JDM cars to attend my best friend’s funeral. rss
    13. 🔗 r/LocalLLaMA BalatroBench - Benchmark LLMs' strategic performance in Balatro rss

      BalatroBench - Benchmark LLMs' strategic performance in Balatro | If you own a copy of Balatro, you can make your local LLM play it. I built tools to let LLMs play Balatro autonomously. The LLM gets the game state as text, decides what to do (play, discard, buy from shop...), and the action executes in the actual game. No hard-coded heuristics — all decisions come from the LLM. BalatroBot is a mod that exposes an HTTP API for game state and controls. BalatroLLM is the bot framework — it works with any OpenAI-compatible endpoint (Ollama, vLLM, etc.). You can write your own strategy (Jinja2 templates that define how game state is prompted and what the LLM's decision philosophy should be). Different strategies lead to very different results with the same model. Benchmark results across various models (including open-weight ones) are on BalatroBench Resources: - BalatroBot: Balatro mod with HTTP API - BalatroLLM: Bot framework — create strategies, plug in your model - BalatroBench: Leaderboard and results (source) - Discord PS: You can watch an LLM struggling to play Balatro live on Twitch - rn Opus 4.6 is playing submitted by /u/S1M0N38
      [link] [comments]
      ---|---

    14. 🔗 @binaryninja@infosec.exchange Command Palette is getting a serious upgrade in the upcoming Jotunheim mastodon

      Command Palette is getting a serious upgrade in the upcoming Jotunheim release! Beyond actions, you can now search functions and symbols, types, strings, open tabs, and even project files, all from the keyboard. Read about it in our latest blog post: https://binary.ninja/2026/02/05/command-palette-updates.html

    15. 🔗 r/Yorkshire Ideas for a very brief Yorkshire Dales tour with my elderly dad? (Coming from NY) rss

      Hello,

      My father, who is 91, and I will be meeting in London in late-May. He hasn't flown in possibly 20 years, doesn't like it, and he's not so ambulatory - uses a cane, walks very slowly; he'll be the first to admit he's impatient with folks and crowds, etc. (He'd be flying from the east coast of the U.S. with my stepbrother; I'm coming from the west coast.) But it's a very special occasion because a play he's been involved with for about 50 years is finally coming to fruition as a musical, and the producers are paying, so we've encouraged him to go to this once-in-a-time event, and I think he's even gotten excited about it. My brother and his son will also be joining.

      Meanwhile, my father and I are both watching and loving "All Creatures Great and Small." We talk about how gorgeous are the Dales. Given the variables above, do you have any thoughts about a short, and very easy, trip from London? I'm thinking maybe we take the train up to York, spend the night, next day take a day tour into the gloriousness (an "All Creatures" focus?; doesn't have to be; that may be too on the nose for him, and just seeing the landscape may suffice), come back and spend the night in York, go back to London the next day, then he and stepbrother return to NY. Maybe it would tack on an extra 3 days.

      I have no idea if he would go for this little idea of mine, but before I even present it to him, I wanted to hear your thoughts... It would be so special for me to do this with him, of course - that is, if he's up for it.

      Thanks!

      submitted by /u/icycoldplum
      [link] [comments]

    16. 🔗 badlogic/pi-mono v0.52.2 release

      Changed

      • Updated default model for anthropic provider to claude-opus-4-6
      • Updated default model for openai-codex provider to gpt-5.3-codex
      • Updated default model for amazon-bedrock provider to us.anthropic.claude-opus-4-6-v1:0
      • Updated default model for vercel-ai-gateway provider to anthropic/claude-opus-4-6
      • Updated default model for opencode provider to claude-opus-4-6
    17. 🔗 badlogic/pi-mono v0.52.1 release

      No content.

    18. 🔗 r/Leeds Headingley traffic. rss

      Traffic in headingley seriously needs to be sorted, on bus right now and weve been sat in headingley for 40 minutes now and barely moved 50 yards. This is genuinely ridiculous.

      I understand theres work being done but the impact this work is having is genuinely mental. The amount of times ive been late to college because of it is insane, and yes, i could get an earlier bus but with the time it takes to not only get into town, plus the traffic, thats like an extra hour and twenty minutes early id have to leave.

      Sorry for the rant but its getting on my nerves now

      submitted by /u/DotsV2
      [link] [comments]

    19. 🔗 badlogic/pi-mono v0.52.0 release

      New Features

      • Claude Opus 4.6 model support.
      • GPT-5.3 Codex model support (OpenAI Codex provider only).
      • SSH URL support for git packages. See docs/packages.md.
      • auth.json API keys now support shell command resolution (!command) and environment variable lookup. See docs/providers.md.
      • Model selectors now display the selected model name.

      Added

      • API keys in auth.json now support shell command resolution (!command) and environment variable lookup, matching the behavior in models.json
      • Added minimal-mode.ts example extension demonstrating how to override built-in tool rendering for a minimal display mode
      • Added Claude Opus 4.6 model to the model catalog
      • Added GPT-5.3 Codex model to the model catalog (OpenAI Codex provider only)
      • Added SSH URL support for git packages (#1287 by @markusn)
      • Model selectors now display the selected model name (#1275 by @haoqixu)

      Fixed

      • Fixed HTML export losing indentation in ANSI-rendered tool output (e.g. JSON code blocks in custom tool results) (#1269 by @aliou)
      • Fixed images being silently dropped when prompt() is called with both images and streamingBehavior during streaming. steer(), followUp(), and the corresponding RPC commands now accept optional images. (#1271 by @aliou)
      • CLI --help, --version, --list-models, and --export now exit even if extensions keep the event loop alive (#1285 by @ferologics)
      • Fixed crash when models send malformed tool arguments (objects instead of strings) (#1259)
      • Fixed custom message expand state not being respected (#1258 by @Gurpartap)
      • Fixed skill loader to respect .gitignore, .ignore, and .fdignore when scanning directories
    20. 🔗 gulbanana/gg GG 0.38.0 release

      This release is based on Jujutsu 0.38.

      Added

      • There's a toggle in the bottom left of the screen marked with a 🛡; turning it on acts like jj --ignore-immutable, affording you infinite power.
      • The command-line argument --ignore-immutable will turn on the new toggle at startup.
      • The additional revsets displayed in the left-pane selector can be customised by adding config values under [gg.revsets].

      Changed

      • Temporary behavioural toggles are now represented with a sticky button instead of a checkbox.

      Fixed

      • In web mode, right-clicking on revisions enabled context menu commands based on the selected revision rather than the one you'd clicked.
    21. 🔗 r/Yorkshire York Minster rss

      York Minster | It felt so good to be back in God's county recently. I love York, it's one of my favourite cities of all time. submitted by /u/justchoo
      [link] [comments]
      ---|---

    22. 🔗 r/reverseengineering Hardware Hacking - $15 FRS Radio teardown - 1 hour video rss
    23. 🔗 r/Yorkshire Volunteer Opportunity 😎 rss

      Volunteer Opportunity 😎 | submitted by /u/IV_Sheffield
      [link] [comments]
      ---|---

    24. 🔗 r/reverseengineering dotNetPELoader——A C#-based PELoader for x64 and x86. rss
    25. 🔗 r/wiesbaden Where to buy very high end gaming PCs LOCAL. rss

      You could probably guess why, but I don’t have a German address. I need to buy a PC locally.

      submitted by /u/Careful-Foot8399
      [link] [comments]

    26. 🔗 r/LocalLLaMA We built an 8B world model that beats 402B Llama 4 by generating web code instead of pixels — open weights on HF rss

      We built an 8B world model that beats 402B Llama 4 by generating web code instead of pixels — open weights on HF | Hey r/LocalLLaMA, Here's something new for you: Mobile World Models.
      We just released gWorld — open-weight visual world models for mobile GUIs (8B and 32B). Demo Video Explanation: Here's gWorld 32B imagining a multi-step Booking dot com session — zero access to the real app:
      1. Sees flight search form (Detroit → Chicago)
      2. Click "Search" → writes code → renders full results page with airlines, prices, times
      3. Click destination field → predicts the search UI with history

      Every screen = executable HTML/CSS/JS rendered to pixels.

      The core idea: Instead of predicting the next screen as pixels (diffusion, autoregressive image gen), gWorld predicts it as executable web code. You render the code, you get the image. This sounds simple but it works remarkably well because VLMs already have strong priors on structured web code from pre-training.

      Why code instead of pixels?

      • Text-based world models lose visual fidelity (can't represent layouts, colors, images)
      • Pixel-generation models hallucinate text and structural elements
      • Code generation gives you the best of both: precise text rendering from linguistic priors + high-fidelity visuals from structured code

      Results on MWMBench (6 benchmarks, 4 ID + 2 OOD):

      | Model | Size | Avg Accuracy |
      | --- | --- | --- |
      | Qwen3 VL | 8B | 29.2% |
      | Llama 4 Scout | 109B (A17B) | 50.0% |
      | Llama 4 Maverick | 402B (A17B) | 55.7% |
      | Qwen3 VL | 235B (A22B) | 51.5% |
      | GLM-4.6V | 106B | 67.4% |
      | gWorld | 8B | 74.9% |
      | gWorld | 32B | 79.6% |

      The 8B model beats everything up to 50× its size. Render failure rate is <1% (vs 40% for base Qwen3 VL 8B before our training).

      Other things worth noting:

      • Data scaling follows a power law with R² ≥ 0.94 — gains are predictable and nowhere near saturating
      • We include a Korean apps benchmark (KApps) as OOD eval — the models generalize well cross-lingually
      • The data pipeline is automated: repurpose existing trajectory data → cross-modal relabeling to code → synthetic reasoning traces
      • We also show that better world models → better downstream GUI agent performance

      Why this matters beyond benchmarks: The bottleneck for training GUI agents with online RL is device-policy coupling — every rollout needs a real Android emulator. World models could decouple this entirely, enabling massively parallel rollouts on pure compute. gWorld is a step in that direction.

      Links:

      Happy to answer questions.
      Built by Trillion Labs × KAIST AI.

      submitted by /u/jshin49
      [link] [comments]

    27. 🔗 @malcat@infosec.exchange Sometimes, the absence of signature match is also interesting. Here the mastodon

      Sometimes, the absence of a signature match is also interesting. Here, the #Chrysalis sideloaded DLL, where we can quickly spot the few interesting functions.

      Make sure to check "Show UNK" !

    28. 🔗 jj-vcs/jj v0.38.0 release

      About

      jj is a Git-compatible version control system that is both simple and powerful. See
      the installation instructions to get started.

      Release highlights

      • Per-repo and per-workspace config is now stored outside the repo, for security
        reasons. This is not a breaking change because we automatically migrate
        legacy repos to this new format. .jj/repo/config.toml and
        .jj/workspace-config.toml should no longer be used.

      Breaking changes

      • The minimum supported git command version is now 2.41.0. macOS users will
        need to either upgrade "Developer Tools" to 26 or install Git from
        e.g. Homebrew.

      • Deprecated ui.always-allow-large-revsets setting and all: revset modifier
        have been removed.

      • <name>@<remote> revset symbols can also be resolved to remote tags. Tags are
        prioritized ahead of bookmarks.

      • Legacy placeholder support used for unset user.name or user.email has been
        removed. Commits containing these values will now be pushed with jj git push
        without producing an error.

      • If any side of a conflicted file is missing a terminating newline, then the
        materialized file in the working copy will no longer be terminated by a
        newline.

      Deprecations

      • The revset function diff_contains() has been renamed to diff_lines().

      New features

      • jj git fetch now shows details of abandoned commits (change IDs and
        descriptions) by default, matching the jj abandon output format.
        #3081

      • jj workspace root now accepts an optional --name argument to show
        the root path of the specified workspace (defaults to the current one). When
        given a workspace that was created before this release, it errors out.

      • jj git push --bookmark <name> will now automatically track the bookmark if
        it isn't tracked with any remote already.

      • Add git_web_url([remote]) template function that converts a git remote URL
        to a web URL, suitable for opening in a browser. Defaults to the "origin"
        remote.

      • New divergent() revset function for divergent changes.

      • String pattern values in revsets and templates can now be substituted by
        aliases. For example, grep(x) = description(regex:x) now works.

      • A new config option remotes.<name>.auto-track-created-bookmarks behaves
        similarly to auto-track-bookmarks, but it only applies to bookmarks created
        locally. Setting it to "*" is now the closest replacement for the deprecated
        git.push-new-bookmarks option.

      • jj tag list can now be filtered by revset.

      • Conflict markers will use LF or CRLF as the line ending according to the
        contents of the file.
        #7376

      • New experimental jj git fetch --tag flag to fetch tags in the same way as
        bookmarks. If specified, tags won't be fetched implicitly, and only tags
        matching the pattern will be fetched as <name>@<remote> tags. The fetched
        remote tags will be tracked by the local tags of the same name.

      • New remote_tags() revset function to query remote tags.

      • New builtin hyperlink() template function that gracefully falls back to
        text when outputting to a non-terminal, instead of emitting raw OSC 8 escape
        codes. #7592

      Fixed bugs

      • jj git init --colocate now refuses to run inside a Git worktree, providing
        a helpful error message with alternatives.
        #8052

      • jj git push now ensures that tracked remote bookmarks are updated even if
        there are no mappings in the Git fetch refspecs.
        #5115

      • jj git fetch/push now forwards most of git stderr outputs such as
        authentication requests. #5760

      • Conflicted bookmarks and tags in trunk() will no longer generate verbose
        warnings. The configured trunk() alias will temporarily be disabled.
        #8501

      • Dynamic shell completion for jj config unset now only completes
        configuration options which are set.
        #7774

      • Dynamic shell completion no longer attempts to resolve aliases at the
        completion position. This previously prevented a fully-typed alias from
        being accepted on some shells and replaced it entirely with its expansion on
        bash. Now, the completion will only resolve the alias, and suggest candidates
        accordingly, after the cursor has been advanced to the next position.
        #7773

      • Setting the editor via ui.editor, $EDITOR, or JJ_EDITOR now respects shell quoting.

      • jj gerrit upload will no longer swallow errors and surface if changes fail
        to get pushed to gerrit.
        #8568

      • jj file track --include-ignored now works when fsmonitor.backend="watchman".
        #8427

      • Conflict labels are now preserved correctly when restoring files from commits
        with different conflict labels.

      • The empty tree is now always written when the working copy is empty.
        #8480

      • When using the Watchman filesystem monitor, changes to .gitignore now trigger
        a scan of the affected subtree so newly unignored files are discovered.
        #8427

      • --quiet now hides progress bars.

      Contributors

      Thanks to the people who made this release happen!

    29. 🔗 r/Leeds Kitchen Space to Rent? Please Read!! rss

      Hiya! I’m a student in Leeds, currently trying to set up a cake business. I’ve got everything ready and I’m all good to go, except for the fact that annoyingly, as I’m in student accommodation, I cannot unfortunately run a business from my flat, as it’s against my contract.

      I also cannot afford a commercial kitchen, as the very cheapest I’ve found are £25 per hour, meaning for each cake, which takes around 3-4 hours to make, I’d have to charge £75-100 on top of the cost of ingredients and labour, essentially making it completely unviable.

      My question is, is there anybody in Leeds who has a kitchen that they would be willing to rent to me for 3-4 hours a week, sometimes less based on how many orders I receive, for about £20 a day? I know it’s not much but I’ve worked so hard to start this business and I can’t give up now! Plus I’d give you free cake!! I have a level 2 hygiene and cleaning certificate, and would clean before and after all cakes are made, you’d never even know I was there! I’d be entirely out of your hair.

      This request is also going out to any local kitchens, for example restaurants, cafes and bakery’s that may have available kitchen space during the day or at any point, I will cook at 3am if I have to!!

      Please please let me know if anyone has any suggestions, and if they may know anyone who would be willing to help out!

      I know this is an odd request but this business is my baby and I can’t lose it before it’s even begun!!

      Thanks 🤩

      submitted by /u/AnnualProfessional93
      [link] [comments]

    30. 🔗 Anton Zhiyanov (Un)portable defer in C rss

      Modern system programming languages, from Hare to Zig, seem to agree that defer is a must-have feature. It's hard to argue with that, because defer makes it much easier to free memory and other resources correctly, which is crucial in languages without garbage collection.

      The situation in C is different. There was an N2895 proposal by Jens Gustedt and Robert Seacord in 2021, but it was not accepted for C23. Now there's another proposal, N3734 by JeanHeyd Meneide, which will probably be accepted in the next version of the standard.

      Since defer isn't part of the standard, people have created lots of different implementations. Let's take a quick look at them and see if we can find the best one.

      Contents: C23/GCC · C11/GCC · GCC/Clang · MSVC · Long jump · For loop · Stack · Simplified GCC/Clang · Final thoughts

      C23/GCC

      Jens Gustedt offers this brief version:

      #define defer __DEFER(__COUNTER__)
      #define __DEFER(N) __DEFER_(N)
      #define __DEFER_(N) __DEFER__(__DEFER_FUNCTION_##N, __DEFER_VARIABLE_##N)
      
      #define __DEFER__(F, V)        \
          auto void F(int*);         \
          [[gnu::cleanup(F)]] int V; \
          auto void F(int*)
      

      Usage example:

      void loud_free(void* p) {
          printf("freeing %p\n", p);
          free(p);
      }
      
      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer { loud_free(p); }
      
          *p = 42;
          printf("p = %d\n", *p);
      }
      
      
      
      p = 42
      freeing 0x127e05b30
      

      This approach combines C23 attribute syntax ([[attribute]]) with GCC-specific features: nested functions (auto void F(int*)) and the cleanup attribute. It also uses the non-standard __COUNTER__ macro (supported by GCC, Clang, and MSVC), which expands to an automatically increasing integer value.
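
      To make the macro machinery a little more concrete, here is roughly what a single defer { loud_free(p); } expands to, assuming __COUNTER__ happens to yield 0 at that point (a hand-written sketch, not actual preprocessor output):

      // Rough expansion of `defer { loud_free(p); }` for __COUNTER__ == 0:

      auto void __DEFER_FUNCTION_0(int*);                           // declare the nested function
      [[gnu::cleanup(__DEFER_FUNCTION_0)]] int __DEFER_VARIABLE_0;  // dummy variable carrying the cleanup
      auto void __DEFER_FUNCTION_0(int*) { loud_free(p); }          // the user's block becomes its body

      // The intermediate __DEFER(N) -> __DEFER_(N) step exists so that __COUNTER__
      // is expanded to a number before ## pastes it into the generated names.

      Each defer in the same function draws a fresh counter value, so the generated function and variable names never collide.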

      Nested functions and cleanup in GCC

      A nested function (also known as a local function) is a function defined inside another function:

      void outer() {
          int x = 10;
      
          void inner() {
              x += 10;
          }
      
          inner();
      }
      

      Nested functions can access variables from the enclosing scope, similar to closures in other languages, but they are not first-class citizens and cannot be passed around like function pointers.

      The cleanup attribute runs a function when the variable goes out of scope:

      void safe_free(int **ptr) {
          if (!ptr || !*ptr) return;
          free(*ptr);
      }
      
      int main(void) {
          __attribute__((cleanup(safe_free))) int *p = malloc(sizeof(int));
          if (!p) return 1;
          *p = 42;
      
          // safe_free(&p) will be called automatically
          // when p goes out of scope.
      }
      

      The function should take one parameter, which is a pointer to a type that's compatible with the variable. If the function returns a value, it will be ignored.

      On the plus side, this version works just like you'd expect defer to work. On the downside, it's only available in C23+ and only works with GCC (not even Clang supports it, because of the nested function).

      Another downside is that using nested functions requires an executable stack, which security experts strongly discourage.

      Executable stack vulnerability

      When we use nested functions in GCC, the compiler often creates trampolines (small pieces of machine code) on the stack at runtime. These trampolines let the nested function access variables from the parent function's scope. For the CPU to run these code fragments, the stack's memory pages need to be marked as executable.

      An executable stack is a serious security risk because it makes buffer overflow attacks much easier. In these attacks, a hacker sends more data than a program can handle, which overwrites the stack with harmful "shellcode". If the stack is non-executable (which is the standard today), the CPU won't run that code and the program will just crash. But since our defer macro makes the stack executable, an attacker can jump straight to their injected code and run it, giving them complete control over the process.

      C11/GCC

      We can easily adapt the above version to use C11:

      #define defer _DEFER(__COUNTER__)
      #define _DEFER(N) __DEFER(N)
      #define __DEFER(N) ___DEFER(__DEFER_FUNC_##N, __DEFER_VAR_##N)
      
      #define ___DEFER(F, V)                                         \
          auto void F(void*);                                        \
          __attribute__((cleanup(F))) int V __attribute__((unused)); \
          auto void F(void* _dummy_ptr)
      

      Usage example:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer { loud_free(p); }
      
          *p = 42;
          printf("p = %d\n", *p);
      }
      
      
      
      p = 42
      freeing 0x127e05b30
      

      The main downside remains: it's GCC-only.

      GCC/Clang

      Clang fully supports the cleanup attribute, but it doesn't support nested functions. Instead, it offers the blocks extension, which works in a somewhat similar way:

      void outer() {
          __block int x = 10;
      
          void (^inner)(void) = ^{
              x += 10;
          };
      
          inner();
      }
      

      We can use Clang blocks to make a defer version that works with both GCC and Clang:

      #if defined(__clang__)
      
      // Clang implementation.
      #define _DEFER_CONCAT(a, b) a##b
      #define _DEFER_NAME(a, b) _DEFER_CONCAT(a, b)
      
      static inline void _defer_cleanup(void (^*block)(void)) {
          if (*block) (*block)();
      }
      
      #define defer                                                                   \
          __attribute__((unused)) void (^_DEFER_NAME(_defer_var_, __COUNTER__))(void) \
              __attribute__((cleanup(_defer_cleanup))) = ^
      
      #elif defined(__GNUC__)
      
      // GCC implementation.
      #define defer _DEFER(__COUNTER__)
      #define _DEFER(N) __DEFER(N)
      #define __DEFER(N) ___DEFER(__DEFER_FUNC_##N, __DEFER_VAR_##N)
      
      #define ___DEFER(F, V)                                         \
          auto void F(void*);                                        \
          __attribute__((cleanup(F))) int V __attribute__((unused)); \
          auto void F(void* _dummy_ptr)
      
      #else
      
      // Runtime error for unsupported compilers.
      #define defer assert(!"unsupported compiler");
      
      #endif
      

      Usage example:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer { loud_free(p); };
      
          *p = 42;
          printf("p = %d\n", *p);
      }
      
      
      
      p = 42
      freeing 0x127e05b30
      

      Now it works with Clang, but there are several things to be aware of:

      1. We must compile with -fblocks.
      2. We must put a ; after the closing brace in the deferred block: defer { ... };.
      3. If we need to modify a variable inside the defer block, the variable must be declared with __block:

        __block int x = 0; defer { x += 10; };

      On the plus side, this implementation works with both GCC and Clang. The downside is that it's still not standard C, and won't work with other compilers like MSVC.

      MSVC

      MSVC, of course, doesn't support the cleanup attribute. But it provides "structured exception handling" with the __try and __finally keywords:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          __try {
              *p = 42;
              printf("p = %d\n", *p);
          }
          __finally {
              loud_free(p);
          }
      }
      

      The code in the __finally block will always run, no matter how the __try block exits — whether it finishes normally, returns early, or crashes (for example, from a null pointer dereference).

      This isn't the defer we're looking for, but it's a decent alternative if you're only programming for Windows.

      Long jump

      There are well-known defer implementations by Jens Gustedt and moon-chilled that use setjmp and longjmp. I'm mentioning them for completeness, but honestly, I would never use them in production. The first one is extremely large, and the second one is extremely hacky. Also, I'd rather not use long jumps unless it's absolutely necessary.

      Still, here's a usage example from Gustedt's library:

      guard {
          void * const p = malloc(25);
          if (!p) break;
          defer free(p);
      
          void * const q = malloc(25);
          if (!q) break;
          defer free(q);
      
          if (mtx_lock(&mut)==thrd_error) break;
          defer mtx_unlock(&mut);
      }
      

      Here, all deferred statements run at the end of the guarded block, no matter how we exit the block (normally or through break).

      For loop

      The stc library probably has the simplest defer implementation ever:

      #define defer(...) \
          for (int _c_i3 = 0; _c_i3++ == 0; __VA_ARGS__)
      

      Usage example:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer(loud_free(p)) {
              *p = 42;
              printf("p = %d\n", *p);
          }
      }
      
      
      
      p = 42
      freeing 0x127e05b30
      

      Here, the deferred statement is passed as __VA_ARGS__ and is used as the loop increment. The "defer-aware" block of code is the loop body. Since the increment runs after the body, the deferred statement executes after the main code.

      This approach works with all mainstream compilers, but it falls apart if you try to exit early with break or return:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer(loud_free(p)) {
              *p = 42;
              if (*p == 42) {
                  printf("early exit, defer is not called\n");
                  break;
              }
              printf("p = %d\n", *p);
          }
      }
      
      
      
      early exit, defer is not called
      

      Stack

      Dmitriy Kubyshkin provides a defer implementation that adds a "stack frame" of deferred calls to any function that needs them. Here's a simplified version:

      #define countof(A) ((sizeof(A)) / (sizeof((A)[0])))
      
      // Deferred function and its argument.
      struct _defer_ctx {
          void (*fn)(void*);
          void* arg;
      };
      
      // Calls all deferred functions in LIFO order.
      static inline void _defer_drain(
          const struct _defer_ctx* it,
          const struct _defer_ctx* end) {
          for (; it != end; it++) it->fn(it->arg);
      }
      
      // Initializes the defer stack with the given size
      // for the current function.
      #define defers(n)                     \
          struct {                          \
              struct _defer_ctx* first;     \
              struct _defer_ctx items[(n)]; \
          } _deferred = {&_deferred.items[(n)], {0}}
      
      // Pushes a deferred function call onto the stack.
      #define defer(_fn, _arg)                              \
          do {                                              \
              if (_deferred.first <= &_deferred.items[0]) { \
                  assert(!"defer stack overflow");          \
              }                                             \
              struct _defer_ctx* d = --_deferred.first;     \
              d->fn = (void (*)(void*))(_fn);               \
              d->arg = (void*)(_arg);                       \
          } while (0)
      
      // Calls all deferred functions and returns from the current function.
      #define returnd                                          \
          while (                                              \
              _defer_drain(                                    \
                  _deferred.first,                             \
                  &_deferred.items[countof(_deferred.items)]), \
              1) return
      

      Usage example:

      int main(void) {
          // The function supports up to 16 deferred calls.
          defers(16);
      
          int* p = malloc(sizeof(int));
          if (!p) returnd 1;
          defer(loud_free, p);
      
          *p = 42;
          printf("p = %d\n", *p);
      
          // We must exit through returnd to
          // ensure deferred functions are called.
          returnd 0;
      }
      
      
      
      p = 42
      freeing 0x127e05b30
      

      This version works with all mainstream compilers. Also, unlike the STC version, defers run correctly in case of early exit:

      int main(void) {
          defers(16);
      
          int* p = malloc(sizeof(int));
          if (!p) returnd 1;
          defer(loud_free, p);
      
          *p = 42;
          if (*p == 42) {
              printf("early exit\n");
              returnd 0;
          }
      
          printf("p = %d\n", *p);
          returnd 0;
      }
      
      
      
      early exit
      freeing 0x127e05b30
      

      Unfortunately, there are some drawbacks:

      • Defer only supports single-function calls, not code blocks.
      • We always have to call defers at the start of the function and exit using returnd. In the original implementation, Dmitriy overrides the return keyword, but this won't compile with strict compile flags (which I think we should always use).
      • The deferred function runs before the return value is evaluated, not after.
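
      To see the last point concretely, here's roughly what a returnd statement expands to, written out by hand from the macro above (compute_result() is just a placeholder name):

      // returnd compute_result();  becomes roughly:
      while (_defer_drain(_deferred.first,
                          &_deferred.items[countof(_deferred.items)]), 1)
          return compute_result();
      // The comma expression in the loop condition runs the drain first,
      // so every deferred call fires before compute_result() is evaluated.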

      Simplified GCC/Clang

      The Stack version above doesn't support deferring code blocks. In my opinion, that's not a problem, since most defers are just "free this resource" actions, which only need a single function call with one argument.

      If we accept this limitation, we can simplify the GCC/Clang version by dropping GCC's nested functions and Clang's blocks:

      #define _DEFER_CONCAT(a, b) a##b
      #define _DEFER_NAME(a, b) _DEFER_CONCAT(a, b)
      
      // Deferred function and its argument.
      struct _defer_ctx {
          void (*fn)(void*);
          void* arg;
      };
      
      // Calls the deferred function with its argument.
      static inline void _defer_cleanup(struct _defer_ctx* ctx) {
          if (ctx->fn) ctx->fn(ctx->arg);
      }
      
      // Create a deferred function call for the current scope.
      #define defer(fn, ptr)                                      \
          struct _defer_ctx _DEFER_NAME(_defer_var_, __COUNTER__) \
              __attribute__((cleanup(_defer_cleanup))) =          \
                  {(void (*)(void*))(fn), (void*)(ptr)}
      

      Works like a charm:

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer(loud_free, p);
      
          *p = 42;
          printf("p = %d\n", *p);
      }
      
      
      
      p = 42
      freeing 0x127e05b30
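
      Since the cleanup attribute fires when the hidden context variable goes out of scope, early returns are covered as well. A quick check along the lines of the earlier examples (my own test, assuming the same loud_free helper):

      int main(void) {
          int* p = malloc(sizeof(int));
          if (!p) return 1;
          defer(loud_free, p);
      
          *p = 42;
          if (*p == 42) {
              printf("early exit\n");
              return 0;  // the deferred loud_free(p) still runs here
          }
      
          printf("p = %d\n", *p);
      }

      This should print "early exit" followed by the freeing message, matching the Stack version's early-exit behavior.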
      

      Final thoughts

      Personally, I like the simpler GCC/Clang version better. Not having MSVC support isn't a big deal, since we can run GCC on Windows or use the Zig compiler, which works just fine.

      But if I really need to support GCC, Clang, and MSVC — I'd probably go with the Stack version.

      Anyway, I don't think we need to wait for defer to be added to the C standard. We already have defer at home!

    31. 🔗 r/Yorkshire Bridlington this morning rss

      Bridlington this morning | submitted by /u/Charlatans1969
      [link] [comments]
      ---|---

    32. 🔗 r/Yorkshire York Gate Garden (Nr. Leeds) rss

      York Gate Garden (Nr. Leeds) | submitted by /u/arioandy
      [link] [comments]
      ---|---

    33. 🔗 HexRaysSA/plugin-repository commits sync plugin-repository.json rss
      sync plugin-repository.json
      
      No plugin changes detected
      
    34. 🔗 r/york Gyms rss

      Are there any good and cheap gyms? Mostly looking for affordable gyms for students.

      submitted by /u/MarzipanNo3989
      [link] [comments]

    35. 🔗 r/LocalLLaMA Google Research announces Sequential Attention: Making AI models leaner and faster without sacrificing accuracy rss

      Google Research announces Sequential Attention: Making AI models leaner and faster without sacrificing accuracy | submitted by /u/Fear_ltself
      [link] [comments]
      ---|---

    36. 🔗 r/LocalLLaMA Qwen3-Coder-Next on RTX 5060 Ti 16 GB - Some numbers rss

      Qwen3-Coder-Next on RTX 5060 Ti 16 GB - Some numbers | About 2 weeks ago, I posted about running GLM-4.7-Flash on 16 GB of VRAM here www.reddit.com/r/LocalLLaMA/comments/1qlanzn/glm47flashreap_on_rtx_5060_ti_16_gb_200k_context/. And here we go, today, let's squeeze an even bigger model into the poor rig. Hardware:

      • AMD Ryzen 7 7700X
      • RAM 32 GB DDR5-6000
      • RTX 5060 Ti 16 GB

      Model: unsloth/Qwen3-Coder-Next-GGUF Q3_K_M Llama.cpp version: llama.cpp@b7940 The llamap.cpp command:

      llama-server -m ./Qwen3-Coder-Next-Q3_K_M.gguf -c 32768 -np 1 -t 8 --temp 1.0 --top-p 0.95 --top-k 40 --min-p 0.01 --jinja --fit on -fa 1
      

      Model: unsloth/Qwen3-Coder-Next-GGUF Q3_K_M
      Llama.cpp version: llama.cpp@b7940
      The llama.cpp command:

      When I started, I didn't expect much, given that my best result for GLM-4.7-Flash was something like ~300 t/s pp and 14 t/s gen. Maybe I'd end up with a lot of OOMs and crashes. But, to my surprise, the card was able to pull it off! When llama.cpp is fully loaded, it takes 15.1 GB of GPU memory and 30.2 GB of RAM. The rig is almost at its memory limit. During prompt processing, GPU usage was about 35% and CPU usage about 15%. During token generation, it's 45% for the GPU and 25%-45% for the CPU. So perhaps there is some room to squeeze in some tuning here.

      Does it run? Yes, and it's quite fast for a 5060!

      | Metric | Task 2 (Large Context) | Task 190 (Med Context) | Task 327 (Small Context) |
      |---|---|---|---|
      | Prompt Eval (Prefill) | 154.08 t/s | 225.14 t/s | 118.98 t/s |
      | Generation (Decode) | 16.90 t/s | 16.82 t/s | 18.46 t/s |

      The above run was with a 32k context size. Later on, I tried again with a 64k context size; the speed did not change much.

      Is it usable? I'd say yes, not Opus 4.5 or Gemini Flash usable, but I think it's pretty close to my experience when Claude Sonnet 3.7 or 4 was still a thing.

      One thing that sticks out is that this model uses far fewer tool calls than Opus, so it feels fast. It seems to read the whole file at once when needed, rather than grepping every 200 lines like the Claude brothers.

      One-shotting something seems to work pretty well, until it runs into bugs. In my example, I asked the model to create a web-based chess game with a Python backend, connected via WebSocket. The model showed that it can debug problems by jumping back and forth between frontend and backend code very well.

      When facing a problem, it first hypothesizes a cause, then works its way through the code to verify it. Then there's a lot of "But wait" and "Hold on", followed by a tool call to read some files, and then a change of direction. Sometimes it works. Sometimes it just burned through the tokens and ended up hitting the context limit. Maybe that's because I was using Q3_K_M, and higher quants would do better here.

      Some screenshots:

      https://gist.github.com/user-attachments/assets/8d074a76-c441-42df-b146-0ae291af17df

      https://gist.github.com/user-attachments/assets/3aa3a845-96cd-4b23-b6d9-1255036106db

      You can see the Claude session logs and llama.cpp logs of the run here https://gist.github.com/huytd/6b1e9f2271dd677346430c1b92893b57

      Update: So, I managed to get some time to sit down and run some tests again. This time, I'm trying to find the sweet spot for --n-cpu-moe. This big *ss model has 512 experts; I'll start with ncmoe = 16.

      % llama-bench -m ./Qwen3-Coder-Next-Q3_K_M.gguf -ngl 99 -ncmoe 16 -fa 1 -t 8 --mmap 0 --no-warmup
      | model                           | size      | params  | backend | ngl | fa | test  | t/s            |
      | ------------------------------- | --------: | ------: | ------- | --: | -: | ----: | -------------: |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | pp512 | 269.74 ± 57.76 |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | tg128 | 5.51 ± 0.03    |
      

      Definitely a no-go: the weights filled up the whole GPU and spilled over into the shared GPU memory, which made it extremely slow.

      Let's do 64 then.

      % llama-bench -m ./Qwen3-Coder-Next-Q3_K_M.gguf -ngl 99 -ncmoe 64 -fa 1 -t 8 --no-warmup
      ggml_cuda_init: found 1 CUDA devices:
      | model                           | size      | params  | backend | ngl | fa | test  | t/s            |
      | ------------------------------- | --------: | ------: | ------- | --: | -: | ----: | -------------: |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | pp512 | 21.23 ± 12.52  |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | tg128 | 12.45 ± 0.79   |
      

      What's happening here is that we get better tg speed, but pp dropped. The GPU was under-utilized: only half of the VRAM was filled.

      Going back to ncmoe = 32 seems to work: no more spillover to the slow shared GPU memory, and everything fits nicely in GPU memory and system memory.

      % llama-bench -m ./Qwen3-Coder-Next-Q3_K_M.gguf -ngl 99 -ncmoe 32 -fa 1 -t 8 --mmap 0 --no-warmup
      | model                           | size      | params  | backend | ngl | fa | test  | t/s            |
      | ------------------------------- | --------: | ------: | ------- | --: | -: | ----: | -------------: |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | pp512 | 275.89 ± 65.48 |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | tg128 | 20.21 ± 0.57   |
      

      So 32 was a safe number; let's try something lower, like 28:

      % llama-bench -m ./Qwen3-Coder-Next-Q3_K_M.gguf -ngl 99 -ncmoe 28 -fa 1 -t 8 --mmap 0 --no-warmup
      | model                           | size      | params  | backend | ngl | fa | test  | t/s            |
      | ------------------------------- | --------: | ------: | ------- | --: | -: | ----: | -------------: |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | pp512 | 253.92 ± 59.39 |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | tg128 | 7.92 ± 0.13    |
      

      Nope! It spilled over to the slow shared GPU memory again. Let's bump it up to 30:

      % llama-bench -m ./Qwen3-Coder-Next-Q3_K_M.gguf -ngl 99 -ncmoe 30 -fa 1 -t 8 --mmap 0 --no-warmup
      | model                           | size      | params  | backend | ngl | fa | test  | t/s            |
      | ------------------------------- | --------: | ---------: | ------- | --: | -: | ----: | -------------: |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | pp512 | 296.60 ± 73.63 |
      | qwen3next 80B.A3B Q3_K - Medium | 35.65 GiB | 79.67 B | CUDA    | 99  | 1  | tg128 | 20.15 ± 1.06   |
      

      So I think this is the sweet spot for the RTX 5060 Ti on this Q3_K_M quant: pp at 296.60 t/s and tg at 20.15 t/s.

      Q3_K_M performance

      submitted by /u/bobaburger
      [link] [comments]

    37. 🔗 r/Leeds help my dother prom rss

      My daughter's prom is coming up this year and she wants a goth-style dress. I'm completely clueless about that style (I'm a very girly girl myself) but I really want her to be happy. We've ordered a few things from eBay and I think she likes the style, but the fit just isn't good. Not because of her preferences, just the quality and sizing. I want her to look good, and a well-fitted dress can do wonders.

      Is there anywhere we can actually go in person to try on goth‑style dresses? Every shop I visit is full of pink, puffy, fitted dresses that just aren’t her vibe.

      submitted by /u/JammyD0dgers
      [link] [comments]

    38. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 release rss
      sync repo: +1 release
      
      ## New releases
      - [IDASQL](https://github.com/allthingsida/idasql): 0.0.1
      
    39. 🔗 Console.dev newsletter memlab rss

      Description: Find JS memory leaks.

      What we like: Supports analysis of Chrome browsers, Electron, and NodeJS. Uses the Puppeteer API to automate memory analysis using browsers. Create files defining how to interact with pages. Can be used as an NPM package to run end-to-end tests. Includes a visual debugger.

      What we dislike: Only supports Chromium-based browsers.

    40. 🔗 Console.dev newsletter Whosthere rss

      Description: LAN discovery tool.

      What we like: Scans your local network (mDNS and SSDP) to find devices, identifying them using ARP and manufacturer metadata lookup. Doesn’t require elevated privileges. Can also (optionally) scan ports. Built as a TUI, but can also run in the background with a queryable API. Supports themes.

      What we dislike: Designed as a TUI so the CLI command is more limited.

    41. 🔗 Mitchell Hashimoto My AI Adoption Journey rss
      (empty)