

to read (pdf)

  1. Letting AI Actively Manage Its Own Context | 明天的乌云
  2. Garden Offices for Sale UK - Portable Space
  3. Cord: Coordinating Trees of AI Agents | June Kim
  4. Style tips for less experienced developers coding with AI · honnibal.dev
  5. Haskell for all: Beyond agentic coding

  1. March 12, 2026
    1. 🔗 r/york Living on bishopthorpe road rss

      My partner and I are currently looking to move house. Bishopthorpe Road has always been an option; however, realistically we would not get much for our money there and would need to compromise on space etc. to live in the area.

      Does anyone have experience of living there, and is it worth the premium price?

      Thanks!

      submitted by /u/Bubbly-Biscotti9744

    2. 🔗 r/york Moving to York rss

      Hiya! My wife and I are looking to move and buy a house near York city in the near future!

      We would love to hear which places around York are nice and good to buy in.

      Thanks!

      submitted by /u/wawewia

    3. 🔗 vercel-labs/agent-browser v0.18.0 release

      Minor Changes

      942b8cd: New Features

      * inspect command - Opens Chrome DevTools for the active page by launching a local proxy server that forwards the DevTools frontend to the browser's CDP WebSocket. Commands continue to work while DevTools is open. Implemented in both the Node.js and native paths. (#736)
      * get cdp-url subcommand - Retrieve the Chrome DevTools Protocol WebSocket URL for the active page, useful for external debugging tools. (#736)
      * Native screenshot annotate - The --annotate flag for screenshots now works in the native Rust daemon, bringing parity with the Node.js path. (#706)

      Improvements

      * **KERNEL_API_KEY now optional** - External credential injection no longer requires `KERNEL_API_KEY` to be set, making it easier to use Kernel with pre-configured environments.
      * **Browserbase simplified** - Removed the `BROWSERBASE_PROJECT_ID` requirement, reducing setup friction for Browserbase users. ([#625](https://github.com/vercel-labs/agent-browser/pull/625))

      Bug Fixes

      * Fixed Browserbase API using incorrect endpoint to release sessions ([#707](https://github.com/vercel-labs/agent-browser/pull/707))
      * Fixed CDP connect paths using hardcoded 10s timeout instead of `getDefaultTimeout()` ([#704](https://github.com/vercel-labs/agent-browser/pull/704))
      * Fixed lone Unicode surrogates causing errors by sanitizing with `toWellFormed()` ([#720](https://github.com/vercel-labs/agent-browser/pull/720))
      * Fixed CDP connection failure on IPv6-first systems ([#717](https://github.com/vercel-labs/agent-browser/pull/717))
      * Fixed recordings not inheriting the current viewport settings ([#718](https://github.com/vercel-labs/agent-browser/pull/718))
      
    4. 🔗 HexRaysSA/plugin-repository commits sync repo: +1 plugin, +3 releases, ~3 changed rss
      
      ## New plugins
      - [HashDB](https://github.com/OALabs/hashdb-ida) (1.10.0)
      
      ## New releases
      - [DBImporter](https://github.com/HexRaysSA/ida-dbimporter): 0.0.2
      - [Suture](https://github.com/libtero/suture): 1.2.0
      
      ## Changes
      - [bindiff](https://github.com/HexRays-plugin-contributions/bindiff):
        - 8.0.0: download URL changed
      - [binexport](https://github.com/HexRays-plugin-contributions/binexport):
        - 12.0.0: download URL changed
      - [xray](https://github.com/HexRays-plugin-contributions/xray):
        - 2025.9.24: download URL changed
      
    5. 🔗 r/reverseengineering Reverse Engineering the undocumented ResetEngine.dll: A C++ tool to programmatically trigger a silent Windows Factory Reset (PBR) bypassing SystemSettings UI. rss
    6. 🔗 r/Yorkshire The Life of Chuck rss

      Just started watching this on Netflix... this is what they think North Yorkshire looks like?

      submitted by /u/Neffwood

    7. 🔗 r/reverseengineering Near complete hypervisor, driver, and system binary analysis for the Xbox Series consoles rss
    8. 🔗 apple/embedding-atlas v0.18.1 release

      What's Changed

      Full Changelog: v0.18.0...v0.18.1

    9. 🔗 r/york Yorks Royal Chamberpot rss

      Charles II chamberpot made by Marmaduke Best, York. Marmaduke Rawdon gave the City of York a "silver chamber pott of the value of ten punds". In 1850, Queen Victoria’s husband, Prince Albert, visited the Mansion House and may have used the chamberpot!

      submitted by /u/York_shireman

    10. 🔗 r/Leeds Anyone looking for more Alt/Rock Friends? like Key Club, Spoons, NQ64, Pixel Bar etc?.. Join our Alt/Rock/Emo Whatsapp Social Group! xo rss

      Love Keyclub (Slamdunk, FUEL, GARAGE Clubnights), NQ64, Pixel Bar, Wetherspoons, Pubs etc but have a lack of alternative friends to go with? Just want to make more alternative friends, have fun chats & get involved in social events?

      A few of us from Reddit, Facebook etc have banded together from previous appeals and have a new fun Whatsapp Alt/Rock/Emo Social Group chat now, 80+ members and counting!

      We had a successful recruitment post on here a few months ago which blew up & got overwhelming, so we had to trickle people in, but there are too many to go through, so I'm starting a fresh new post to add more people.

      The group is roughly 18-35 age range & currently around 50/50 gender mix so plenty of people of different age/genders etc, very inclusive and everyone is getting on great together.

      We have regular nights out, especially on weekends (Keyclub club nights, Spoons, bars, NQ64, Pixel Bar, Flight Club, cinema trips... anything fun really!), which can get anywhere from 10-15 people attending. Spoons & Key Club on Saturdays is a particular fave, but we are always planning social events, midweek chill things etc.

      If you'd like to join then leave a comment with your age/gender & I'll DM you an invite! all welcome

      I will invite people in slowly so as to keep the ratio of ages, sexes etc balanced, so there's always people of similar age etc.

      Leave a comment & I'll DM an invite when available! x

      submitted by /u/rmonkey100

    11. 🔗 r/reverseengineering Live From RE//verse 2026: WARP Signatures with Mason Reed (Stream - 06/03/2026) rss
    12. 🔗 backnotprop/plannotator v0.12.0 release

      Follow @plannotator on X for updates

      Claude Code users, want to give feedback on approval? Please upvote & comment here.


      Missed recent releases?

      Release | Highlights
      ---|---
      v0.11.4 | Git add from code review, bidirectional scroll navigation, clipboard paste for annotation images, VS Code IPC port stability
      v0.11.3 | Expandable diff context, hierarchical folder tree, redesigned worktree controls, supply chain hardening
      v0.11.2 | Git worktree support in code review, VS Code editor annotations in review, Obsidian auto-save & separator settings, session discovery, smart file resolution
      v0.11.1 | VS Code extension for in-editor plan review, Pinpoint mode for point-and-click annotations, untracked files in code review
      v0.11.0 | Auto-save annotation drafts, comment popover, Obsidian vault browser, deny message framing fix, configurable OpenCode timeout
      v0.10.0 | Short URL sharing with E2E encryption, code suggestions in review UI, CJK input method support, customizable Obsidian filenames, XDG install fix
      v0.9.3 | Linked document navigation & annotation, VS Code diff integration, toolbar dismiss fix, automated npm publishing
      v0.9.0 | Plan Diff with two view modes, version history, sidebar redesign, terminology cleanup
      v0.8.5 | Pi coding agent support, auto-close countdown, image endpoint security fix, OpenCode package fix
      v0.8.0 | Open source (MIT/Apache-2.0), annotate command, self-hosted share portal, resizable panels, mermaid controls, auto-close on approval, documentation site


      What's New in v0.12.0

      This is a community release. Ten of the fourteen PRs in v0.12.0 were authored by external contributors, spanning three major features and a sweep of cross-platform fixes. The annotation system gained preset labels for one-click feedback — no typing, just click and move on. The plan viewer now renders Graphviz diagrams alongside Mermaid, renders inline markdown images with a lightbox zoom, and shows all diagrams rendered by default instead of as raw source. And the entire UI works on mobile.

      Quick Annotation Labels

      Reviewing a plan often means the same feedback applies to multiple sections — "clarify this," "verify this assumption," "match existing patterns." Quick Labels turn those into one-click preset chips that appear above the annotation toolbar. Select text, click a label, done. No typing required.

      Ten default labels ship out of the box, each with an emoji and a color-coded pill: ❓ Clarify this · 🗺️ Missing overview · 🔍 Verify this · 🔬 Give me an example · 🧬 Match existing patterns · 🔄 Consider alternatives · 📉 Ensure no regression · 🚫 Out of scope · 🧪 Needs tests · 👍 Nice approach

      Several labels carry agent-facing tips that get injected into the feedback. For example, selecting a section and clicking "🔍 Verify this" tells the agent: "This seems like an assumption. Verify by reading the actual code before proceeding." The "🧬 Match existing patterns" label instructs the agent to search the codebase for existing solutions rather than introducing a new approach. These tips are invisible to the reviewer but shape how the agent responds. When the feedback is exported, labeled annotations are grouped into a Label Summary section at the bottom (for example, **🔍 Verify this**: 3) so both the reviewer and the agent can see at a glance which patterns recur across the plan.

      Labels are fully customizable in Settings. Add up to 12, reorder them, pick custom colors and tips, or remove the ones you never use. Settings persist across sessions via cookies. A follow-up PR introduced a dedicated Quick Label editing mode alongside Markup, Comment, and Redline. In this mode, selecting text immediately shows a floating label picker — no toolbar intermediary. Alt+1 through Alt+0 keyboard shortcuts work in any mode for power users who prefer not to reach for the mouse.

      Authored by @grubmanItay in #268 and #272

      Mobile Compatibility

      Plannotator was desktop-only. That mattered less when the tool was purely a local dev workflow, but with shared URLs and team reviews becoming common, people were opening plan links on phones and tablets and getting a broken layout.

      The UI now adapts fully below 768px. The header collapses into a hamburger menu. The annotation panel renders as a full-screen overlay with a backdrop and close button. Touch support covers resize handles, pinpoint annotations, text selection, and the toolstrip. Card action buttons are always visible on touch devices instead of appearing on hover. The Settings modal switches to a horizontal tab bar. The CommentPopover width is capped to the viewport so it doesn't overflow off-screen. Desktop layout is completely unchanged — this is additive, not a redesign.

      Authored by @grubmanItay in #260

      Graphviz Diagram Rendering

      Plannotator has supported Mermaid diagrams since v0.6.8. Plans that use Graphviz for architecture diagrams, dependency graphs, or state machines were stuck with raw DOT source in a code block. The Viewer now renders graphviz, dot, and gv fenced code blocks using @viz-js/viz, with the same UX conventions as Mermaid: source/diagram toggle, zoom and pan controls, and an expanded fullscreen view.

      Authored by @flex-yj-kim in #266

      Mermaid Diagram Improvements

      The Mermaid viewer received a substantial UX overhaul. Diagrams now open in a proper expanded fullscreen mode with zoom in/out, fit-to-view, and wheel zoom. The source/diagram toggle was reworked for clarity. Wide diagrams no longer clip against container edges in both plan view and plan diff view. Safari stability issues with SVG rendering were resolved. A separate PR changed both Mermaid and Graphviz diagrams to render by default instead of showing raw source code first — the source toggle is still one click away, but the visual rendering is now the default state.

      Authored by @flex-yj-kim in #264 and #279. Issue #275 filed by @flex-yj-kim.

      Markdown Image Rendering

      Markdown image syntax (![alt](src)) was silently treated as plain text — the ! character wasn't in the inline scanner, so images never rendered. They do now. Local image paths are proxied through the existing /api/image endpoint, and relative paths resolve correctly when annotating files outside the project root. Clicking any rendered image opens a full-screen lightbox with the alt text as a caption. Press Escape or click the backdrop to dismiss.

      Authored by @dgrissen2 in #271

      Linked Doc Navigation in Annotate Mode

      The /plannotator-annotate command lets you annotate any markdown file, but clicking .md links inside that file would break — the annotate server was missing a /api/doc endpoint, so link requests returned raw HTML instead of JSON. This release adds the missing route and supports chained relative link navigation, so you can follow links between sibling markdown files without leaving annotate mode.

      VS Code Extension in SSH Remote Sessions

      The VS Code extension sets PLANNOTATOR_BROWSER to its own open-in-vscode handler so plans open in editor tabs instead of external browsers. In SSH remote sessions, the shared openBrowser() function skipped browser launch entirely — ignoring the custom handler. The fix is a one-line condition change: if PLANNOTATOR_BROWSER is set, always call openBrowser() regardless of remote detection. This covers plan review, code review, and annotate mode.

      Additional Changes

      • Windows markdown path support — plannotator annotate now handles Windows drive-letter paths (C:\..., C:/...), Git Bash/MSYS paths (/c/...), and Cygwin paths (/cygdrive/c/...) in the shared markdown resolver (#267 by @flex-yj-kim)
      • OS-aware update banner — the update banner now detects the user's OS and shows the correct install command: bash/curl on macOS and Linux, PowerShell on Windows (#270, reported by @eromoe in #265)
      • Pi origin in code review — the code review UI now recognizes Pi as a first-class origin with a violet badge, correct install command in the update banner, and proper agent name in the completion overlay (#263)
      • Codex support — documentation and install instructions for running Plannotator inside Codex, which uses the CLI directly without a plugin (#261)
      • Welcome dialog cleanup — removed three first-run dialogs (UI Features Setup, Plan Diff Marketing, What's New v0.11.0) that had outlived their usefulness. The only remaining first-open dialog is the Permission Mode Setup, which directly affects agent behavior (#280)

      Install / Update

      macOS / Linux:

      curl -fsSL https://plannotator.ai/install.sh | bash
      

      Windows:

      irm https://plannotator.ai/install.ps1 | iex
      

      Claude Code Plugin: Run /plugin in Claude Code, find plannotator, and click "Update now".

      OpenCode: Clear cache and restart:

      rm -rf ~/.bun/install/cache/@plannotator
      

      Then in opencode.json:

      {
        "plugin": ["@plannotator/opencode@latest"]
      }
      

      Pi: Install or update the extension:

      pi install npm:@plannotator/pi-extension
      

      What's Changed

      Contributors

      @grubmanItay was a major contributor to this release with three PRs — Quick Annotation Labels, Quick Label Mode, and full mobile support. The labels system touched the annotation pipeline end-to-end: new UI components, settings persistence, keyboard shortcuts, export formatting, and share URL backward compatibility.

      @flex-yj-kim continues as the project's most prolific external contributor. Four PRs in this release: Graphviz rendering, Mermaid viewer overhaul, render-by-default diagrams, and Windows path support. Across v0.9.3 through v0.12.0, Yeongjin has authored twelve merged PRs spanning both the plan and code review UIs.

      @dgrissen2 returns and shipped two PRs — markdown image rendering with the lightbox viewer and the annotate-mode linked doc navigation fix. Both address gaps where the viewer silently dropped content instead of rendering it.

      @7tg, who originated the VS Code extension, authored the SSH remote fix, which he also reported in #259 with a thorough diagnostic of the underlying IPC issue.

      Community members who reported issues and participated in discussions that shaped this release:

      Full Changelog : v0.11.4...v0.12.0

    13. 🔗 sacha chua :: living an awesome life Small steps towards using OpenAI-compatible text-to-speech services with speechd-el or emacspeak rss

      Speech synthesis has come a long way since I first tried out Emacspeak in 2002. Kokoro TTS and Piper offer more natural-sounding voices now, although the initial delay in loading the models and generating speech means that they aren't quite ready to completely replace espeak, which is faster but more robotic. I've been using the Kokoro FastAPI through my own functions for working with various speech systems. I wanted to see if I could get Kokoro and other OpenAI-compatible text-to-speech services to work with either speechd-el or Emacspeak, just in case I could take advantage of the rich functionality either provides for speech-synthesized Emacs use. speechd-el is easier to layer on top of an existing Emacs if you only want occasional speech, while Emacspeak voice-enables many packages to an extent beyond simply speaking what's on the screen.

      Speech synthesis is particularly helpful when I'm learning French because I can use it as a reference for what a paragraph or sentence should sound like. It's not perfect. Sometimes it uses liaisons that my tutor and Google Translate don't use. But it's a decent enough starting point. I also used it before to read out IRC mentions and compile notifications so that I could hear them even if I was paying attention to a different activity.

      Here's a demonstration of speechd reading out the following lines using the code I've just uploaded to https://codeberg.org/sachac/speechd-ai:

      • The quick brown fox jumps over the lazy dog.
      • Now let's set the language to French so we can read the next line.
      • Bonjour, je m'appelle Emacs.

      Screencast showing speechd-el

      There's about a 2-second delay between the command and the start of the audio for the sentence.

      Note that speechd-speak-read-sentence fails in some cases where (forward-sentence 1) doesn't end up in the same place as (backward-sentence 1) followed by (forward-sentence 1), which can happen when you're in an Org Mode list. I've submitted a patch upstream.

      Aside from that, speechd-speak-set-language, speechd-speak-read-paragraph and speechd-speak-read-region are also useful commands. I think the latency makes this best-suited for reading paragraphs, or for shadowing sentences for language learning.

      I'm still trying to figure out how to get speechd-speak to work as smoothly as I'd like. I think I've got it set up so that the server falls back to espeak for short texts so that it can handle words or characters better, and uses the specified server for longer ones. I'd like to get to the point where it can handle all the things that speechd usually does, like saying lines as I navigate through them or giving me feedback as I'm typing. Maybe it can use espeak for fast feedback character by character and word by word, and then use Kokoro TTS for the full sentence when I finish. Then it will be possible to use it to type things without looking at the screen.

      After putting this together, I still find myself leaning towards my own functions because they make it easy to save the generated speech output to a file, which is handy for saving reference audio that I can play on my phone and for making replays almost instant. That could also be useful for pre-generating the next paragraph to make it flow more smoothly. Still, it was interesting making something that is compatible with existing protocols and libraries.

      Posting it in case anyone else wants to use it as a starting point. The repository also contains the starting point for an Emacspeak-compatible speech server. See speechd-ai/README.org for more details.

      https://codeberg.org/sachac/speechd-ai

      You can e-mail me at sacha@sachachua.com.

    14. 🔗 r/Leeds Road closed by Wellington Place rss

      Does anyone know what happened here? There seems to be a car with a couple of windows smashed out and the police have closed off the road (see pics). Car has been there since about 11.30am and they cleared the builders out of the building site as well

      submitted by /u/watchitspaceman

    15. 🔗 r/reverseengineering Debugging An Undebuggable App rss
    16. 🔗 r/Yorkshire Is there a clear footpath walk from whitby to Robinhoods Bay? rss

      Not been in years and considering a day out this weekend.

      submitted by /u/saltlampsandphotos

    17. 🔗 r/reverseengineering Chip Uploading - Emulation Online rss
    18. 🔗 r/reverseengineering Archive of classic reverse engineering tutorials (Armadillo, ASProtect, Themida, SoftICE era) rss
    19. 🔗 r/reverseengineering GitHub - iss4cf0ng/Elfina: Elfina is a multi-architecture ELF loader supporting x86 and x86-64 binaries. rss
    20. 🔗 r/reverseengineering HellsUchecker: ClickFix to blockchain-backed backdoor rss
    21. 🔗 r/Leeds Budget friendly places to get fresh flowers? Thought about Leeds market? Thanks!💐 rss

      Not sure of prices these days..

      submitted by /u/Bright_Fill_4770

    22. 🔗 r/reverseengineering Reverse Engineering Action's Cheap Fichero Labelprinter rss
    23. 🔗 r/LocalLLaMA I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. rss

      English is not my first language. I wrote this in Chinese and translated it with AI help. The writing may have some AI flavor, but the design decisions, the production failures, and the thinking that distilled them into principles — those are mine.

      I was a backend lead at Manus before the Meta acquisition. I've spent the last 2 years building AI agents — first at Manus, then on my own open-source agent runtime (Pinix) and agent (agent-clip). Along the way I came to a conclusion that surprised me:

      A single run(command="...") tool with Unix-style commands outperforms a catalog of typed function calls.

      Here's what I learned.


      Why *nix

      Unix made a design decision 50 years ago: everything is a text stream. Programs don't exchange complex binary structures or share memory objects — they communicate through text pipes. Small tools each do one thing well, composed via | into powerful workflows. Programs describe themselves with --help, report success or failure with exit codes, and communicate errors through stderr.

      LLMs made an almost identical decision 50 years later: everything is tokens. They only understand text, only produce text. Their "thinking" is text, their "actions" are text, and the feedback they receive from the world must be text.

      These two decisions, made half a century apart from completely different starting points, converge on the same interface model. The text-based system Unix designed for human terminal operators — cat, grep, pipe, exit codes, man pages — isn't just "usable" by LLMs. It's a natural fit. When it comes to tool use, an LLM is essentially a terminal operator — one that's faster than any human and has already seen vast amounts of shell commands and CLI patterns in its training data.

      This is the core philosophy of the *nix agent: don't invent a new tool interface. Take what Unix has proven over 50 years and hand it directly to the LLM.


      Why a single run

      The single-tool hypothesis

      Most agent frameworks give LLMs a catalog of independent tools:

      tools: [search_web, read_file, write_file, run_code, send_email, ...]

      Before each call, the LLM must make a tool selection — which one? What parameters? The more tools you add, the harder the selection, and accuracy drops. Cognitive load is spent on "which tool?" instead of "what do I need to accomplish?"

      My approach: one run(command="...") tool, all capabilities exposed as CLI commands.

      run(command="cat notes.md") run(command="cat log.txt | grep ERROR | wc -l") run(command="see screenshot.png") run(command="memory search 'deployment issue'") run(command="clip sandbox bash 'python3 analyze.py'")

      The LLM still chooses which command to use, but this is fundamentally different from choosing among 15 tools with different schemas. Command selection is string composition within a unified namespace — function selection is context-switching between unrelated APIs.

      LLMs already speak CLI

      Why are CLI commands a better fit for LLMs than structured function calls?

      Because CLI is the densest tool-use pattern in LLM training data. Billions of lines on GitHub are full of:

      ```bash
      # README install instructions
      pip install -r requirements.txt && python main.py

      # CI/CD build scripts
      make build && make test && make deploy

      # Stack Overflow solutions
      cat /var/log/syslog | grep "Out of memory" | tail -20
      ```

      I don't need to teach the LLM how to use CLI — it already knows. This familiarity is probabilistic and model-dependent, but in practice it's remarkably reliable across mainstream models.

      Compare two approaches to the same task:

      ```
      Task: Read a log file, count the error lines

      Function-calling approach (3 tool calls):
      1. read_file(path="/var/log/app.log")     → returns entire file
      2. search_text(text=, pattern="ERROR")    → returns matching lines
      3. count_lines(text=)                     → returns number

      CLI approach (1 tool call):
      run(command="cat /var/log/app.log | grep ERROR | wc -l") → "42"
      ```

      One call replaces three. Not because of special optimization — but because Unix pipes natively support composition.

      Making pipes and chains work

      A single run isn't enough on its own. If run can only execute one command at a time, the LLM still needs multiple calls for composed tasks. So I built a chain parser (parseChain) into the command routing layer, supporting four Unix operators:

      |   Pipe: stdout of the previous command becomes stdin of the next
      &&  And: execute the next command only if the previous one succeeded
      ||  Or: execute the next command only if the previous one failed
      ;   Seq: execute the next command regardless of the previous result
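
      To make the routing concrete, here's a minimal Go sketch of the operator-splitting step. The step and parseChain names follow the post, but this toy version ignores quoting and escaping, which the real internal/chain.go has to handle:

      ```go
      package main

      import (
          "fmt"
          "strings"
      )

      // step is one command in a chain plus the operator that links it
      // to the previous step ("" for the first step).
      type step struct {
          op  string // "", "|", "&&", "||", ";"
          cmd string
      }

      // parseChain splits a command line on the four chain operators.
      // Toy version: no quote or escape handling.
      func parseChain(line string) []step {
          ops := []string{"&&", "||", "|", ";"} // two-char operators first
          var steps []step
          var cur strings.Builder
          op := ""
          for i := 0; i < len(line); {
              matched := ""
              for _, o := range ops {
                  if strings.HasPrefix(line[i:], o) {
                      matched = o
                      break
                  }
              }
              if matched == "" {
                  cur.WriteByte(line[i])
                  i++
                  continue
              }
              steps = append(steps, step{op, strings.TrimSpace(cur.String())})
              cur.Reset()
              op = matched
              i += len(matched)
          }
          return append(steps, step{op, strings.TrimSpace(cur.String())})
      }

      func main() {
          // One tool call, three steps: && gates on success, | pipes stdout.
          for _, s := range parseChain(`curl -sL $URL -o data.csv && cat data.csv | head 5`) {
              fmt.Printf("%-2s %q\n", s.op, s.cmd)
          }
      }
      ```

      The executor then walks the steps: for | it feeds the previous stdout in as stdin, for && and || it consults the previous exit code, and for ; it just moves on.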

      With this mechanism, every tool call can be a complete workflow:

      ```bash
      # One tool call: download → inspect
      curl -sL $URL -o data.csv && cat data.csv | head 5

      # One tool call: read → filter → sort → top 10
      cat access.log | grep "500" | sort | head 10

      # One tool call: try A, fall back to B
      cat config.yaml || echo "config not found, using defaults"
      ```

      N commands × 4 operators — the composition space grows dramatically. And to the LLM, it's just a string it already knows how to write.

      The command line is the LLM's native tool interface.


      Heuristic design: making CLI guide the agent

      Single-tool + CLI solves "what to use." But the agent still needs to know "how to use it." It can't Google. It can't ask a colleague. I use three progressive design techniques to make the CLI itself serve as the agent's navigation system.

      Technique 1: Progressive --help discovery

      A well-designed CLI tool doesn't require reading documentation — because --help tells you everything. I apply the same principle to the agent, structured as progressive disclosure: the agent doesn't need to load all documentation at once, but discovers details on-demand as it goes deeper.

      Level 0: Tool Description → command list injection

      The run tool's description is dynamically generated at the start of each conversation, listing all registered commands with one-line summaries:

      Available commands:
        cat — Read a text file. For images use 'see'. For binary use 'cat -b'.
        see — View an image (auto-attaches to vision)
        ls — List files in current topic
        write — Write file. Usage: write <path> [content] or stdin
        grep — Filter lines matching a pattern (supports -i, -v, -c)
        memory — Search or manage memory
        clip — Operate external environments (sandboxes, services)
        ...

      The agent knows what's available from turn one, but doesn't need every parameter of every command — that would waste context.

      Note: There's an open design question here: injecting the full command list vs. on-demand discovery. As commands grow, the list itself consumes context budget. I'm still exploring the right balance. Ideas welcome.

      Level 1: command (no args) → usage

      When the agent is interested in a command, it just calls it. No arguments? The command returns its own usage:

      ``` → run(command="memory") [error] memory: usage: memory search|recent|store|facts|forget

      → run(command="clip") clip list — list available clips clip — show clip details and commands clip [args...] — invoke a command clip pull [name] — pull file from clip to local clip push — push local file to clip ```

      Now the agent knows memory has five subcommands and clip supports list/pull/push. One call, no noise.

      Level 2: command subcommand (missing args) → specific parameters

      The agent decides to use memory search but isn't sure about the format? It drills down:

      ``` → run(command="memory search") [error] memory: usage: memory search [-t topic_id] [-k keyword]

      → run(command="clip sandbox") Clip: sandbox Commands: clip sandbox bash <script> clip sandbox read clip sandbox write File transfer: clip sandbox pull [local-name] clip sandbox push ```

      Progressive disclosure: overview (injected) → usage (explored) → parameters (drilled down). The agent discovers on-demand, each level providing just enough information for the next step.

      This is fundamentally different from stuffing 3,000 words of tool documentation into the system prompt. Most of that information is irrelevant most of the time — pure context waste. Progressive help lets the agent decide when it needs more.

      This also imposes a requirement on command design: every command and subcommand must have complete help output. It's not just for humans — it's for the agent. A good help message means one-shot success. A missing one means a blind guess.
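
      As an illustration, here's a compact Go sketch of the three disclosure levels: one-line summaries for the Level 0 injection, usage on a bare call, per-subcommand usage on drill-down. The registry shape and helper names are mine, not from the agent-clip source:

      ```go
      package main

      import (
          "fmt"
          "sort"
          "strings"
      )

      // cmd holds a one-line summary (Level 0), a usage string (Level 1),
      // and per-subcommand usage (Level 2).
      type cmd struct {
          summary string
          usage   string
          subs    map[string]string
      }

      var registry = map[string]cmd{
          "memory": {
              summary: "Search or manage memory",
              usage:   "usage: memory search|recent|store|facts|forget",
              subs: map[string]string{
                  "search": "usage: memory search [-t topic_id] [-k keyword]",
              },
          },
          "cat": {
              summary: "Read a text file. For images use 'see'.",
              usage:   "usage: cat <path>",
          },
      }

      // level0 builds the command list injected into run's tool description.
      func level0() string {
          names := make([]string, 0, len(registry))
          for n := range registry {
              names = append(names, n)
          }
          sort.Strings(names)
          var b strings.Builder
          b.WriteString("Available commands:\n")
          for _, n := range names {
              fmt.Fprintf(&b, "  %s — %s\n", n, registry[n].summary)
          }
          return b.String()
      }

      // route answers missing-argument calls with usage, not a stack trace.
      func route(args []string) string {
          c, ok := registry[args[0]]
          if !ok {
              return "[error] unknown command: " + args[0] + "\n" + level0()
          }
          if len(args) > 1 {
              if u, ok := c.subs[args[1]]; ok {
                  return "[error] " + args[0] + ": " + u // Level 2
              }
          }
          return "[error] " + args[0] + ": " + c.usage // Level 1
      }

      func main() {
          fmt.Print(level0())
          fmt.Println(route([]string{"memory"}))
          fmt.Println(route([]string{"memory", "search"}))
      }
      ```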

      Technique 2: Error messages as navigation

      Agents will make mistakes. The key isn't preventing errors — it's making every error point in the right direction.

      Traditional CLI errors are designed for humans who can Google. Agents can't Google. So I require every error to contain both "what went wrong" and "what to do instead":

      ```
      Traditional CLI:
      $ cat photo.png
      cat: binary file (standard output)
      → Human Googles "how to view image in terminal"

      My design:
      [error] cat: binary image file (182KB). Use: see photo.png
      → Agent calls see directly, one-step correction
      ```

      More examples:

      ```
      [error] unknown command: foo
      Available: cat, ls, see, write, grep, memory, clip, ...
      → Agent immediately knows what commands exist

      [error] not an image file: data.csv (use cat to read text files)
      → Agent switches from see to cat

      [error] clip "sandbox" not found. Use 'clip list' to see available clips
      → Agent knows to list clips first
      ```

      Technique 1 (help) solves "what can I do?" Technique 2 (errors) solves "what should I do instead?" Together, the agent's recovery cost is minimal — usually 1-2 steps to the right path.
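
      The pattern is mechanical enough to enforce with a tiny helper that refuses to emit an error without a next step. A Go sketch (agentErr is an illustrative name, not from the source):

      ```go
      package main

      import "fmt"

      // agentErr renders an error the way Technique 2 prescribes:
      // what went wrong, then what to do instead.
      func agentErr(what, instead string) string {
          return fmt.Sprintf("[error] %s. %s", what, instead)
      }

      func main() {
          fmt.Println(agentErr("cat: binary image file (182KB)", "Use: see photo.png"))
          fmt.Println(agentErr(`clip "sandbox" not found`, "Use 'clip list' to see available clips"))
      }
      ```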

      Real case: The cost of silent stderr

      For a while, my code silently dropped stderr when calling external sandboxes — whenever stdout was non-empty, stderr was discarded. The agent ran pip install pymupdf, got exit code 127, and never saw the bash: pip: command not found on stderr. It only knew "it failed," not "why" — and proceeded to blindly guess its way through 10 different package managers before one finally worked (the full trace is in Story 2 below). If stderr had been visible the first time, one call would have been enough.

      stderr is the information agents need most, precisely when commands fail. Never drop it.

      Technique 3: Consistent output format

      The first two techniques handle discovery and correction. The third lets the agent get better at using the system over time.

      I append consistent metadata to every tool result:

      file1.txt  file2.txt  dir1/
      [exit:0 | 12ms]

      The LLM extracts two signals:

      Exit codes (Unix convention, LLMs already know these):

      • exit:0 — success
      • exit:1 — general error
      • exit:127 — command not found

      Duration (cost awareness):

      • 12ms — cheap, call freely
      • 3.2s — moderate
      • 45s — expensive, use sparingly

      After seeing [exit:N | Xs] dozens of times in a conversation, the agent internalizes the pattern. It starts anticipating — seeing exit:1 means check the error, seeing long duration means reduce calls.

      Consistent output format makes the agent smarter over time. Inconsistency makes every call feel like the first.

      The three techniques form a progression:

      --help → "What can I do?" → Proactive discovery Error Msg → "What should I do?" → Reactive correction Output Fmt → "How did it go?" → Continuous learning


      Two-layer architecture: engineering the heuristic design

      The section above described how CLI guides agents at the semantic level. But to make it work in practice, there's an engineering problem: the raw output of a command and what the LLM needs to see are often very different things.

      Two hard constraints of LLMs

      Constraint A: The context window is finite and expensive. Every token costs money, attention, and inference speed. Stuffing a 10MB file into context doesn't just waste budget — it pushes earlier conversation out of the window. The agent "forgets."

      Constraint B: LLMs can only process text. Binary data produces high-entropy meaningless tokens through the tokenizer. It doesn't just waste context — it disrupts attention on surrounding valid tokens, degrading reasoning quality.

      These two constraints mean: raw command output can't go directly to the LLM — it needs a presentation layer for processing. But that processing can't affect command execution logic — or pipes break. Hence, two layers.

      Execution layer vs. presentation layer

      ┌─────────────────────────────────────────────┐
      │ Layer 2: LLM Presentation Layer             │ ← Designed for LLM constraints
      │ Binary guard | Truncation+overflow | Meta   │
      ├─────────────────────────────────────────────┤
      │ Layer 1: Unix Execution Layer               │ ← Pure Unix semantics
      │ Command routing | pipe | chain | exit code  │
      └─────────────────────────────────────────────┘

      When cat bigfile.txt | grep error | head 10 executes:

      Inside Layer 1:
      cat output  → [500KB raw text] → grep input
      grep output → [matching lines] → head input
      head output → [first 10 lines]

      If you truncate cat's output in Layer 1 → grep only searches the first 200 lines, producing incomplete results. If you add [exit:0] in Layer 1 → it flows into grep as data, becoming a search target.

      So Layer 1 must remain raw, lossless, metadata-free. Processing only happens in Layer 2 — after the pipe chain completes and the final result is ready to return to the LLM.

      Layer 1 serves Unix semantics. Layer 2 serves LLM cognition. The separation isn't a design preference — it's a logical necessity.

      Layer 2's four mechanisms

      Mechanism A: Binary Guard (addressing Constraint B)

      Before returning anything to the LLM, check if it's text:

      ```
      Null byte detected            → binary
      UTF-8 validation failed       → binary
      Control character ratio > 10% → binary

      If image: [error] binary image (182KB). Use: see photo.png
      If other: [error] binary file (1.2MB). Use: cat -b file.bin
      ```

      The LLM never receives data it can't process.
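
      A minimal Go version of that guard, using the three checks and the 10% threshold quoted above (the real guard lives in internal/fs.go and may be more elaborate):

      ```go
      package main

      import (
          "bytes"
          "fmt"
          "unicode/utf8"
      )

      // isBinary reports whether data should be kept away from the LLM:
      // a null byte, invalid UTF-8, or more than 10% control characters.
      func isBinary(data []byte) bool {
          if bytes.IndexByte(data, 0x00) != -1 {
              return true // null byte
          }
          if !utf8.Valid(data) {
              return true // not valid UTF-8
          }
          control := 0
          for _, r := range string(data) {
              if r < 0x20 && r != '\n' && r != '\r' && r != '\t' {
                  control++
              }
          }
          return control*10 > utf8.RuneCount(data) // ratio > 10%
      }

      func main() {
          fmt.Println(isBinary([]byte("hello world\n")))        // false
          fmt.Println(isBinary([]byte{0x89, 'P', 'N', 'G', 0})) // true
      }
      ```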

      Mechanism B: Overflow Mode (addressing Constraint A)

      ```
      Output > 200 lines or > 50KB?
      → Truncate to first 200 lines (rune-safe, won't split UTF-8)
      → Write full output to /tmp/cmd-output/cmd-{n}.txt
      → Return to LLM:

      [first 200 lines]
      --- output truncated (5000 lines, 245.3KB) ---
      Full output: /tmp/cmd-output/cmd-3.txt
      Explore: cat /tmp/cmd-output/cmd-3.txt | grep <pattern>
               cat /tmp/cmd-output/cmd-3.txt | tail 100
      [exit:0 | 1.2s]
      ```

      Key insight: the LLM already knows how to use grep, head, tail to navigate files. Overflow mode transforms "large data exploration" into a skill the LLM already has.
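
      A Go sketch of that rule: keep the first 200 lines, spill the full output to /tmp/cmd-output, and append the exploration hints plus the metadata footer. The function name and exact wording are illustrative:

      ```go
      package main

      import (
          "fmt"
          "os"
          "path/filepath"
          "strings"
      )

      const (
          maxLines = 200
          maxBytes = 50 * 1024
      )

      // presentForLLM is Layer 2's overflow mode for a finished command.
      // Sketch only: the real mechanism also truncates rune-safely.
      func presentForLLM(out string, n, exit int, dur string) (string, error) {
          lines := strings.Split(out, "\n")
          if len(lines) <= maxLines && len(out) <= maxBytes {
              return fmt.Sprintf("%s\n[exit:%d | %s]", out, exit, dur), nil
          }
          dir := "/tmp/cmd-output"
          if err := os.MkdirAll(dir, 0o755); err != nil {
              return "", err
          }
          full := filepath.Join(dir, fmt.Sprintf("cmd-%d.txt", n))
          if err := os.WriteFile(full, []byte(out), 0o644); err != nil {
              return "", err
          }
          head := lines
          if len(head) > maxLines {
              head = head[:maxLines]
          }
          return fmt.Sprintf(
              "%s\n--- output truncated (%d lines, %.1fKB) ---\nFull output: %s\nExplore:\n  cat %s | grep <pattern>\n  cat %s | tail 100\n[exit:%d | %s]",
              strings.Join(head, "\n"), len(lines), float64(len(out))/1024,
              full, full, full, exit, dur), nil
      }

      func main() {
          msg, err := presentForLLM(strings.Repeat("log line\n", 5000), 3, 0, "1.2s")
          if err != nil {
              panic(err)
          }
          fmt.Println(strings.Count(msg, "\n"), "lines returned instead of 5000")
      }
      ```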

      Mechanism C: Metadata Footer

      actual output here
      [exit:0 | 1.2s]

      Exit code + duration, appended as the last line of Layer 2. Gives the agent signals for success/failure and cost awareness, without polluting Layer 1's pipe data.

      Mechanism D: stderr Attachment

      When a command fails with stderr, the result becomes:

      ```
      output + "\n[stderr] " + stderr
      ```

      This ensures the agent can see why something failed, preventing blind retries.
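
      A Go sketch of the same rule applied to a locally executed command. The real code wraps sandboxed clips rather than os/exec, but the principle is identical: capture stderr separately and attach it on failure:

      ```go
      package main

      import (
          "bytes"
          "fmt"
          "os/exec"
      )

      // runWithStderr never drops stderr: on failure it is appended to
      // the result exactly as Mechanism D describes.
      func runWithStderr(name string, args ...string) string {
          var stdout, stderr bytes.Buffer
          cmd := exec.Command(name, args...)
          cmd.Stdout = &stdout
          cmd.Stderr = &stderr
          err := cmd.Run()
          out := stdout.String()
          if err != nil && stderr.Len() > 0 {
              out += "\n[stderr] " + stderr.String()
          }
          return out
      }

      func main() {
          fmt.Println(runWithStderr("ls", "/no/such/path"))
      }
      ```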


      Lessons learned: stories from production

      Story 1: A PNG that caused 20 iterations of thrashing

      A user uploaded an architecture diagram. The agent read it with cat, receiving 182KB of raw PNG bytes. The LLM's tokenizer turned these bytes into thousands of meaningless tokens crammed into the context. The LLM couldn't make sense of it and started trying different read approaches — cat -f, cat --format, cat --type image — each time receiving the same garbage. After 20 iterations, the process was force-terminated.

      Root cause: cat had no binary detection, and Layer 2 had no guard.
      Fix: isBinary() guard + error guidance ("Use: see photo.png").
      Lesson: The tool result is the agent's eyes. Return garbage and the agent goes blind.

      Story 2: Silent stderr and 10 blind retries

      The agent needed to read a PDF. It tried pip install pymupdf, got exit code 127. stderr contained bash: pip: command not found, but the code dropped it — because there was some stdout output, and the logic was "if stdout exists, ignore stderr."

      The agent only knew "it failed," not "why." What followed was a long trial-and-error:

      pip install       → 127 (doesn't exist)
      python3 -m pip    → 1   (module not found)
      uv pip install    → 1   (wrong usage)
      pip3 install      → 127
      sudo apt install  → 127
      ... 5 more attempts ...
      uv run --with pymupdf python3 script.py → 0 ✓

      10 calls, ~5 seconds of inference each. If stderr had been visible the first time, one call would have sufficed.

      Root cause: InvokeClip silently dropped stderr when stdout was non-empty.
      Fix: Always attach stderr on failure.
      Lesson: stderr is the information agents need most, precisely when commands fail.

      Story 3: The value of overflow mode

      The agent analyzed a 5,000-line log file. Without truncation, the full text (~200KB) was stuffed into context. The LLM's attention was overwhelmed, response quality dropped sharply, and earlier conversation was pushed out of the context window.

      With overflow mode:

      ```
      [first 200 lines of log content]

      --- output truncated (5000 lines, 198.5KB) ---
      Full output: /tmp/cmd-output/cmd-3.txt
      Explore: cat /tmp/cmd-output/cmd-3.txt | grep <pattern>
               cat /tmp/cmd-output/cmd-3.txt | tail 100
      [exit:0 | 45ms]
      ```

      The agent saw the first 200 lines, understood the file structure, then used grep to pinpoint the issue — 3 calls total, under 2KB of context.

      Lesson: Giving the agent a "map" is far more effective than giving it the entire territory.


      Boundaries and limitations

      CLI isn't a silver bullet. Typed APIs may be the better choice in these scenarios:

      • Strongly-typed interactions: Database queries, GraphQL APIs, and other cases requiring structured input/output. Schema validation is more reliable than string parsing.
      • High-security requirements: CLI string concatenation carries inherent injection risks. In untrusted-input scenarios, typed parameters are safer. agent-clip mitigates this through sandbox isolation.
      • Native multimodal: Pure audio/video processing and other binary-stream scenarios where CLI's text pipe is a bottleneck.

      Additionally, "no iteration limit" doesn't mean "no safety boundaries." Safety is ensured by external mechanisms:

      • Sandbox isolation: Commands execute inside BoxLite containers, with no escape possible
      • API budgets: LLM calls have account-level spending caps
      • User cancellation: The frontend provides cancel buttons, and the backend supports graceful shutdown

      Hand Unix philosophy to the execution layer, hand the LLM's cognitive constraints to the presentation layer, and use help, error messages, and output format as three progressive heuristic navigation techniques.

      CLI is all agents need.


      Source code (Go): github.com/epiral/agent-clip

      Core files: internal/tools.go (command routing), internal/chain.go (pipes), internal/loop.go (two-layer agentic loop), internal/fs.go (binary guard), internal/clip.go (stderr handling), internal/browser.go (vision auto-attach), internal/memory.go (semantic memory).

      Happy to discuss — especially if you've tried similar approaches or found cases where CLI breaks down. The command discovery problem (how much to inject vs. let the agent discover) is something I'm still actively exploring.

      submitted by /u/MorroHsu

    24. 🔗 r/york Community Eid dinner in York? rss

      Hi all! I was wondering if anyone was aware of whether there will be a community Eid dinner in York that's open to non-Muslims?

      submitted by /u/Livid-Trade-3907

    25. 🔗 r/reverseengineering runtime jvm analysis tool i made rss
    26. 🔗 Rust Blog Announcing rustup 1.29.0 rss

      The rustup team is happy to announce the release of rustup version 1.29.0.

      Rustup is the recommended tool to install Rust, a programming language that empowers everyone to build reliable and efficient software.

      What's new in rustup 1.29.0

      Following in the footsteps of many package managers in the pursuit of better toolchain installation performance, the headline of this release is that rustup can now download components concurrently and unpack them while downloads are in progress, in operations such as rustup update or rustup toolchain, and it can concurrently check for updates in rustup check, thanks to a GSoC 2025 project. This is by no means a trivial change, so a long tail of issues might occur; please report any you find!

      Furthermore, rustup now officially supports the following host platforms:

      • sparcv9-sun-solaris
      • x86_64-pc-solaris

      Also, rustup will start automatically inserting the right $PATH entries during rustup-init for the following shells, in addition to those already supported:

      • tcsh
      • xonsh

      This release also comes with other quality-of-life improvements, to name a few:

      • When running rust-analyzer via a proxy, rustup will consider the rust-analyzer binary from PATH when the rustup-managed one is not found.

        • This should be particularly useful if you would like to bring your own rust-analyzer binary, e.g. if you use Neovim, Helix, etc. or are developing rust-analyzer itself.

      • Empty environment variables are now treated as unset. This should help with resetting configuration values to default when an override is present.

      • rustup check will use different exit codes based on whether new updates have been found: it will exit with 100 on any updates or 0 for no updates.

      Furthermore, @FranciscoTGouveia has joined the team. He has shown his talent, enthusiasm and commitment to the project since his first interactions with rustup and has played a significant role in bringing more concurrency to it, so we are thrilled to have him on board and are actively looking forward to what we can achieve together.

      Further details are available in the changelog!

      How to update

      If you have a previous version of rustup installed, getting the new one is as easy as stopping any programs which may be using rustup (e.g. closing your IDE) and running:

      $ rustup self update
      

      Rustup will also automatically update itself at the end of a normal toolchain update:

      $ rustup update
      

      If you don't have it already, you can get rustup from the appropriate page on our website.

      Rustup's documentation is also available in the rustup book.

      Caveats

      Rustup releases can come with problems not caused by rustup itself but just due to having a new release.

      In particular, anti-malware scanners might block rustup or stop it from creating or copying files, especially when installing rust-docs which contains many small files.

      Issues like this should be automatically resolved in a few weeks when the anti-malware scanners are updated to be aware of the new rustup release.

      Thanks

      Thanks again to all the contributors who made this rustup release possible!

    27. 🔗 Console.dev newsletter Ki Editor rss

      Description: Structural code editor.

      What we like: Acts on the AST, so code manipulations happen within the true language syntax, e.g. selecting the whole control statement. This enables AST-native editing, selection, navigation, and find & replace. Has built-in LSP support and a file explorer. Themes and syntax highlighting are powered by Tree-sitter.

      What we dislike: Might take some getting used to - it has a VS Code extension if you prefer a GUI.

    28. 🔗 Console.dev newsletter Agent Safehouse rss

      Description: macOS native AI sandboxing.

      What we like: Denies access outside of your project directory using macOS-native, kernel-level sandboxes. Has safe defaults for access to things like core system tools, network access, Git, etc. Security-sensitive actions require opt-in, e.g. clipboard, Docker, shell access.

      What we dislike: macOS only.

  2. March 11, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-03-11 rss

      IDA Plugin Updates on 2026-03-11

      New Releases:

      Activity:

    2. 🔗 MetaBrainz Schema change release: May 11, 2026 rss

      MusicBrainz is announcing a new schema change release set for May 11, 2026. Schema-wise, this release will be very light. At the same time, we'll be requiring some major dependency upgrades to Perl, PostgreSQL, and Node.js. We'll also be switching from Redis to Valkey in production. See below for more information.

      The only breaking schema change is MBS-14252. It drops columns which are unused even in MusicBrainz Server, so should have little impact.

      Here is the complete list of scheduled tickets:

      Database schema

      The following tickets change the database schema in some way.

      • MBS-6551: Database does not prevent a release from having duplicate label/catno pairs. This ticket involves replacing an index on the release_label table for additional data sanity. We'll introduce a unique index on (release, label, catalog_number) (with NULL values treated as equal). This should have no impact on downstream users.
      • MBS-14092: Add support for series of series. This will allow connecting series that are related to each other in some way; for example, a series of series that have been honored with the same award, like the Golden Globe Award for Best Podcast. This involves adding a new series_series view, and replacing the allowed_series_entity_type constraint on the series_type table. It doesn't modify or remove any other parts of the schema.
      • MBS-14252: Drop "source" column from iswc and isrc tables. As the title says, this drops the unused isrc.source and iswc.source columns from the database. Unless you've specifically referenced these columns in a query, this change should have no impact on you.

      Server dependencies

      • MBS-14243: Upgrade the required version of Perl to 5.42. This is required as Perl 5.38 will no longer receive critical security fixes past July 2026.
      • MBS-14246: Upgrade the required version of PostgreSQL to 18. We last upgraded to PostgreSQL v16 two years ago, and would like to take advantage of the many performance advancements in PostgreSQL since then.

      Note that the PGDG maintains an official APT repository for Debian and Ubuntu. PostgreSQL 18.3 is also available on Amazon RDS.

      An upgrade script will be available for MusicBrainz Docker users with instructions provided at release time.

      • MBS-14244: Upgrade the required version of Node.js to 24. This is a straightforward upgrade to the latest LTS release, as Node.js v20 will soon be end-of-life.
      • MBS-14245: Switch from Redis to Valkey. Valkey is compatible with Redis OSS 7.2, and should be a drop-in replacement. There's no reason to expect that Redis would stop working either. (The commands that MusicBrainz Server uses are very basic, and work even in Redis v3.)

      Search server

      • SEARCH-756: Trigger reindex from dbmirror2 replication data. This drops the dependency on RabbitMQ and pg_amqp for live updating the Solr search indexes, and triggers the reindex process directly from PostgreSQL instead, by relying on the change data we already generate there for replication packets. If you run a local search indexer, this will simplify the setup/dependencies needed. Database-wise, it will require replacing triggers and creating a new "sir" schema.

      We’ll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below or on the relevant above-linked tickets.

    3. 🔗 r/Yorkshire The village waging a very British war on dog waste rss

      Where rolling fields meet towering trees, a hawthorn-lined bridleway on the outskirts of a West Yorkshire town is about as idyllic as a suburban snicket gets. But amid the sound of birdsong and the faint rumble of the nearby M62, anger is also in the air. Warning notices punctuate the path, strewn with capital letters and red text, imploring dog owners to take home their pet's waste. Recently, volunteers collected 350 dog poo bags within a stretch of slightly more than a quarter of a mile (0.4km). Pushed into hedgerows, hung from tree branches and flung into banks along the route, the litter has been piling up on this local route in Scholes, near Cleckheaton. Clean-up volunteers who have had enough have launched their own protest; erecting signs and leaving dozens of the weighty filled bags they collect displayed on the path to make a quiet - but squelchy - statement.

      submitted by /u/coffeewalnut08

    4. 🔗 r/reverseengineering Practical Type Inference: High-Throughput Recovery of Real-World Structures and Function Signatures rss
    5. 🔗 r/reverseengineering FlapOS: an open source alternative firmware for "flapit" devices rss
    6. 🔗 r/Leeds Antique Leeds prints - shops to sell them through? rss

      Hey all. I've a whole load of antique framed prints, all hand-coloured views of Leeds. They were purchased from an antique dealer some years ago, authenticated etc., but the shop has since retired and closed up. We've inherited these from a recently deceased relative. In the collection there are maybe 20 to 30 or more framed prints of views of late-1800s and industrial-revolution Leeds. These are the kind of prints that'll take years to sell individually, but would be good stock for a boutique-type shop in Leeds as a job lot... but I'm based in Notts so can't wander the streets and see who'd be interested.

      Are there any shops or dealer that you can think of that may want to buy the whole collection?

      submitted by /u/KIAA0319

    7. 🔗 r/LocalLLaMA Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show rss

      submitted by /u/dan945

    8. 🔗 r/Leeds Twister Lollies (sour) rss

      My friend’s son has had a Fruit Zinger Twister Lolly every day since their release in summer. They are fast running out. If anyone could help this special family and check the freezer section of any Leeds-based Tesco, Sainsbury’s or One Stop, that would be very kind. Thank you.

      submitted by /u/m00stv

    9. 🔗 r/Yorkshire New Leeds independent newspaper Start up rss

      https://leeds.ghost.io/welcome-to-leeds-new-paper/?ref=leeds-newsletter

      Please consider supporting this project so more independent news gets written about this wonderful city.

      I am not involved in the project but I thought people would appreciate knowing about it.

      submitted by /u/SaveCarbonSaveMoney

    10. 🔗 r/Leeds Found a gift card in roundhay park rss

      I was just on a walk in Roundhay Park when I found a gift card tucked into a life ring near the lake. I’ve now checked the balance and it has 20 quid on it. Does anyone know what this might be about, or if someone’s just “paid it forward”? It was bought on 13 December according to the gift receipt. Cheers

      submitted by /u/Falco1105

    11. 🔗 r/LocalLLaMA llama.cpp on $500 MacBook Neo: Prompt: 7.8 t/s / Generation: 3.9 t/s on Qwen3.5 9B Q3_K_M rss

      Just compiled llama.cpp on a MacBook Neo with 8 GB RAM and the 9B Qwen3.5 model, and it works (slowly, but anyway). Config used:

      Build
      - llama.cpp version: 8294 (76ea1c1c4)

      Machine
      - Model: MacBook Neo (Mac17,5)
      - Chip: Apple A18 Pro
      - CPU: 6 cores (2 performance + 4 efficiency)
      - GPU: Apple A18 Pro, 5 cores, Metal supported
      - Memory: 8 GB unified

      Model
      - Hugging Face repo: unsloth/Qwen3.5-9B-GGUF
      - GGUF file: models/Qwen3.5-9B-Q3_K_M.gguf
      - File size on disk: 4.4 GB

      Launch hyperparams
      ./build/bin/llama-cli \
        -m models/Qwen3.5-9B-Q3_K_M.gguf \
        --device MTL0 \
        -ngl all \
        -c 4096 \
        -b 128 \
        -ub 64 \
        -ctk q4_0 \
        -ctv q4_0 \
        --reasoning on \
        -t 4 \
        -tb 6 \
        -cnv
      

      UPD. I did some benchmarking – a faster 5 tok/sec config for the 9B model is here, and a 10 tok/sec config for the 4B model is here.

      submitted by /u/Shir_man

    12. 🔗 r/wiesbaden Mountain bike shop in WI or MZ rss

      Hi everyone,

      I'm currently looking for a good bike shop in Wiesbaden (or Mainz) that stocks top mountain bikes (no e-bikes), and wanted to ask for your recommendations.

      I'd rather spend my money at a smaller local shop that offers proper advice and sells decent bikes than go straight to the big chains on Mainzer Straße.

      Does anyone have good experiences and a tip for me?

      Thanks! 🙌

      submitted by /u/Exercise-Signal

    13. 🔗 sacha chua :: living an awesome life French pronunciation assessment with Azure and my Emacs setup rss

      I'm working on learning French because it's fun and because I want to help my daughter with her French classes. Some sounds are particularly difficult because they don't exist in English, so they involve moving my tongue and my lips in ways that I'm not used to. I've been working on some sentences my tutor assigned me to help me with various sounds. I practice:

      • during the 45-minute virtual sessions with my tutor twice a week: I record this with OBS
      • recording on my phone when I find myself with some spare time away from my computer, like when I'm waiting for my daughter
      • recording on my computer with Emacs, subed-record, and ffmpeg
      • just out loud, like when I'm skating or doing the dishes


      I've been working on my processes for reviewing reference examples either from text-to-speech engines or my meeting audio, recording and reviewing my attempts, and comparing them.

      I want to be able to quickly listen to my best attempts from a session, compare different versions, and keep track of my progress over time. I also want to figure out how to practice pronunciation in between sessions with my tutor while minimizing the risk of reinforcing mistakes. To segment the audio for review, I use WhisperX to get the transcript and word timing data with speaker identification (diarization). I use subed-record in Emacs to correct the transcript, cut/split/trim audio segments, and compile them. I want to make this process even easier.

      Some of the sounds I'm working on are:

      • /​y/ as in mule and bu
      • /​ʁ/ as in trottoir: making it with less air, but without it feeling like an "h" instead; also, transitioning to /​y/ as in brume
      • /​œ/ as in cœur
      • distinguishing between roue and rue

      Microsoft Azure pronunciation assessment

      I've been thinking about how to practise more effectively in between my twice-weekly tutoring sessions. There's been a fair amount of research into computer-aided language training and pronunciation assessment, and I wonder how I can tweak my processes and interfaces to take advantage of what other people have learned. I think many apps use the confidence scores of Whisper-based speech recognition engines. Other services try to be more detailed. For example, Microsoft Azure offers a pronunciation assessment service that scores audio samples on:

      • accuracy: how closely phonemes match a native speaker's pronunciation
      • fluency: how closely it matches a native speaker's silent breaks between words
      • completeness: whether all the words were said
      • pronunciation: the total score
      • confidence

Azure's syllable and phoneme analyses only work for the en-US locale, so I can't use them for fr-FR. That's okay; LLM pronunciation evaluation might work better for sentences than for words or phonemes anyway.1

      I'll share my results first, and then I'll describe my workflow.

For example, here are some Azure scores for "Trois très grands trains traversent trois trop grandes rues": my Mar 3 attempt, an intentionally bad attempt, a Kokoro TTS sample, and an audio segment of my tutor speaking (not included).

|              | Mar 3 | Bad | Kokoro | Tutor |
|---|---|---|---|---|
| Overall      | 94  | 44 | 97  | 93  |
| Accuracy     | 93  | 63 | 96  | 100 |
| Fluency      | 93  | 62 | 100 | 89  |
| Completeness | 100 | 33 | 100 | 100 |
| Confidence   | 90  | 80 | 92  | 02  |

      I'm not entirely sure how useful these numbers will be. It does distinguish between my current attempts and intentionally bad pronunciation, but I'm not sure how much I can trust it yet, or how useful it will be for guiding my attempts in between tutoring sessions. I can imagine a workflow where I listen to a reference, record my attempt, and replay the reference and the recording while displaying the scores, a waveform, and a spectrogram.
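
Here's a rough sketch of what that listen-record-compare loop could look like (hypothetical, not in my config yet; it assumes mpv for playback and ffmpeg capturing from the default PulseAudio source, and the 5-second window is arbitrary):

(defun my-practice-compare (reference)
  "Play REFERENCE, record a short attempt, then play both back.
Sketch only: assumes `mpv' and `ffmpeg' are installed and that
ffmpeg can capture from the default PulseAudio source."
  (interactive "fReference audio: ")
  (let ((attempt (make-temp-file "attempt" nil ".wav")))
    ;; Listen to the reference first.
    (call-process "mpv" nil nil nil "--really-quiet" reference)
    ;; Record a 5-second attempt.
    (call-process "ffmpeg" nil nil nil "-y" "-f" "pulse" "-i" "default" "-t" "5" attempt)
    ;; Replay the reference and then the attempt for comparison.
    (call-process "mpv" nil nil nil "--really-quiet" reference)
    (call-process "mpv" nil nil nil "--really-quiet" attempt)))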

      Let's look at another example of the scores. The tongue-twister "La mule sûre court plus vite que le loup fou." might offer a more useful comparison because I have a hard time with the u sound in "mule".

|              | Mar 3 | with tutor | Kokoro | Tutor |
|---|---|---|---|---|
| Overall      | 84  | 94  | 100 | 95  |
| Accuracy     | 96  | 95  | 100 | 100 |
| Fluency      | 75  | 92  | 100 | 93  |
| Completeness | 100 | 100 | 100 | 100 |
| Confidence   | 84  | 85  | 89  | 87  |

      Phonemes

      What about trying to get the IPA and then doing some kind of comparison? I tried using Allosaurus and Montreal Forced Aligner for my audio samples for "La mule sûre court plus vite que le loup fou." I've also included IPA output from Kokoro TTS and espeak, which generate them from text, although I've removed the stress marks for easier comparison.

| Source | IPA |
|---|---|
| Mar 3 | naɛmis̪yʁkɔpluzitɑləkœnədufu |
| with tutor (Allosaurus) | lɛns̪ikɑlovitkɛlədufu |
| with tutor (MFA) | lamylsyʁkuʁplyvitkəlølufu |
| Tutor | lamilks̪yuəs̪ikətəlufu |
| Kokoro TTS reference | la myl syʁ kuʁ ply vit kə lə lu fu. |
| Espeak TTS reference | la mjul suɹə kɔt plʌs vaɪt kwɛ lə lup fu. |

The IPA for a sentence is hard to read and compare. Maybe I can get Allosaurus to do word breaks, or maybe I can try a different tool. The phonemes from Montreal Forced Aligner look like they might be a little more manageable; comparing its output with the Kokoro reference, the main difference is that I said /ø/ instead of /ə/. I can get timing data from MFA, so I might be able to use that to break it up into words. It would be nice to get confidence data, though, since I'm pretty sure that y isn't solid yet.
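
One rough way to put a number on that comparison (just a sketch using Emacs's built-in Levenshtein string-distance, not something in my workflow yet) is to strip spaces and stress marks and compute a normalized edit distance against a reference:

(defun my-ipa-distance (a b)
  "Return the edit distance between IPA strings A and B, scaled to 0..1.
Sketch only: strips spaces, periods, and stress marks before comparing."
  (let ((a (replace-regexp-in-string "[ .ˈˌ]" "" a))
        (b (replace-regexp-in-string "[ .ˈˌ]" "" b)))
    (/ (float (string-distance a b))
       (max 1 (length a) (length b)))))

;; Example with the MFA output and the Kokoro reference from the table above;
;; a value near 0 means the attempt is close to the reference.
(my-ipa-distance "lamylsyʁkuʁplyvitkəlølufu"
                 "la myl syʁ kuʁ ply vit kə lə lu fu.")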

      Analyzing multiple tongue-twisters from a single session

      Here are some tongue-twister attempts from my March 6 session, annotated with the comments from my tutor. First, the list of tongue-twisters in this analysis.

      1. Maman peint un grand lapin blanc.
      2. Un enfant intelligent mange lentement.
      3. Le roi croit voir trois noix.
      4. Il est loin mais moins loin que ce coin.
      5. Le témoin voit le chemin loin.
      6. Moins de foin au loin ce matin.
      7. La laine beige sèche près du collège.
      8. La croquette sèche dans l’assiette.
      9. Elle mène son frère à l’hôtel.
      10. Le verre vert est très clair.
      11. Elle aimait manger et rêver.
      12. Le cœur seul pleure doucement.
      13. Le beurre fond dans le cœur chaud.
      14. Tu es sûr du futur ?
      15. Un mur dur bloque la rue.

      Then I can extract audio segments, transcribe the IPA, and send it to Azure for pronunciation analysis all in one go.

      (my-subed-record-analyze-file-with-azure "~/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt")
      
| File | ID | Comments | All | Acc | Flu | Comp | Conf | Phonemes |
|---|---|---|---|---|---|---|---|---|
| | 01-01 | | 85 | 92 | 78 | 100 | 88 | lɒmɒaɒʁɔnlat̪ɒlɒ |
| | 01-02 | Mm hmm | 88 | 86 | 86 | 100 | 87 | mamakɔaɒkul̪ɒlatalɒ |
| | 01-03 | Ouais | 68 | 73 | 69 | 67 | 86 | lɒmɒaɒpzɒlɒ |
| | 01-04 | Ouais | 81 | 78 | 89 | 83 | 86 | ɒenɒmɒt̪ɒʁɔlɒt̪alɒŋ |
| | 02-01 | Ouais, c'est bien | 92 | 93 | 90 | 100 | 90 | ɒjnɒwsɒnɛnt̪eleʒɔnmomuʃlɔns̪mɒ |
| | 03-01 | Uh huh | 98 | 100 | 97 | 100 | 88 | jal̪aʁwaɒll̪awandwa |
| | 04-01 | Ouais, parfait | 91 | 92 | 100 | 89 | 89 | ijelwamenmwdoakesekwaŋ |
| | 05-01 | X: témoin | 83 | 82 | 90 | 83 | 88 | etemwawaəʃemɒdwaŋ |
| | 05-02 | Ouais | 92 | 88 | 96 | 100 | 89 | et̪timwawaeʃemɒlwa |
| | 06-01 | Mm hmm, parfait | 89 | 93 | 94 | 86 | 89 | wɒt̪iəwozwasənmɒtɒŋ |
| | 07-01 | X: près du collège | 78 | 79 | 99 | 71 | 87 | ɒmnteʁalol̪e |
| | 07-02 | X: près | 85 | 86 | 85 | 86 | 88 | ɒjteʁatkœl̪es |
| | 07-03 | Mm hmm | 90 | 93 | 99 | 86 | 89 | ɒl̪mnteadəkol̪en |
| | 08-01 | Ouais, c'est mieux | 99 | 99 | 99 | 100 | 90 | laokes̪estɒnlas̪iə |
| | 08-02 | Mm hmm | 97 | 96 | 100 | 100 | 90 | laokes̪est̪ɒvɒnləs̪iən |
| | 09-01 | Ouais, c'est bien | 99 | 99 | 100 | 100 | 89 | ɛnmɛnsɑnsʁɛajlot̪ɛn |
| | 10-01 | Mm hmm | 87 | 88 | 99 | 83 | 87 | nɛzaʁbɛʁejklə |
| | 11-01 | Mm hmm | 100 | 100 | 100 | 100 | 89 | elɒnmimaŋzeʁɒze |
| | 12-01 | X: doucement | 81 | 86 | 81 | 80 | 87 | ikaʁapjobil̪ədimɒ |
| | 12-02 | Ouais | 82 | 83 | 76 | 100 | 86 | ikɑsuntius̪əmɑn |
| | 12-03 | Ouais, c'est mieux | 70 | 75 | 65 | 80 | 84 | lɛkɑsødius̪əmən |
| | 13-01 | Ouais | 85 | 85 | 85 | 86 | 84 | lidəfɔdɑŋlikɑʁʃəl |
| | 14-01 | Mm hmm | 97 | 96 | 98 | 100 | 87 | kel̪eid̪jefit̪joəʁ |
| | 15-01 | X: rue | 84 | 90 | 78 | 100 | 84 | ɒwl̪midijəlɔʁlɒʁs̪u |
| | 15-02 | X: rue | 84 | 87 | 88 | 83 | 83 | ɒwmjd̪il̪ɔʁnɒku |

      The play buttons even work in Org Mode because I use a custom link type for audio links.
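
The general shape of such a link type (a simplified sketch, not my actual implementation; it assumes mpv for playback, and my real audio: links also handle parameters like ?icon=t) is org-link-set-parameters with a :follow function:

;; Sketch of a custom audio: link type. The real one does more;
;; `mpv' here is an assumption.
(org-link-set-parameters
 "audio"
 :follow (lambda (path &optional _prefix)
           ;; Drop any ?key=value suffix, then play the file.
           (let ((file (car (split-string path "\\?"))))
             (start-process "audio-link" nil "mpv" "--really-quiet" file))))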

      It would be useful to analyze one tongue-twister across multiple sessions to get a sense of my progress.

      Focusing on one word

      Here's a deep-dive on the word "mule" from my March 3 session. I exported the words using the WhisperX JSON. Both Azure speech recognition results and Allosaurus phonemes are all over the place when I try to run them on audio segments with individual words.

      (my-subed-record-extract-words "mule"  "/home/sacha/sync/recordings/processed/2026-03-03-raphael.json" "/home/sacha/proj/french/analysis/mule/index.vtt")
      

      I manually adjusted some timestamps, removed some segments, and added a reference sample from Wiktionary (source, public domain). Here's the WebVTT file with the directives: file:///home/sacha/proj/french/analysis/mule/index.vtt
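
For readers who haven't used subed-record: each cue in these WebVTT files carries its directives in a comment block before the timestamps, roughly like this (an illustration with made-up timestamps and a guessed media path, not a copy of the real file):

NOTE
#+AUDIO: /home/sacha/sync/recordings/processed/2026-03-03-raphael.opus
#+NOTE: Bit of a y

00:12:34.000 --> 00:12:35.200
mule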

      (my-subed-record-analyze-file-with-azure "~/proj/french/analysis/mule/index.vtt")
      
| File | ID | Comments | WhisperX | All | Acc | Flu | Comp | Conf | Phonemes |
|---|---|---|---|---|---|---|---|---|---|
| | 01 | Wiktionary - France | | 97 | 95 | 100 | 100 | 77 | mel |
| | 02 | | 92 | 80 | 68 | 100 | 100 | 78 | mi |
| | 03 | | 91 | 80 | 68 | 100 | 100 | 79 | nol̪ɒj |
| | 04 | Bit of a y | 90 | 95 | 93 | 100 | 100 | 88 | nyjɔl̪ɒ |
| | 05 | Bit of a y | 89 | 88 | 80 | 100 | 100 | 83 | j |
| | 06 | | 89 | 88 | 80 | 100 | 100 | 82 | muə |
| | 07 | Bit of a y | 88 | 10 | 50 | 0 | 0 | 78 | niə |
| | 08 | | 88 | 76 | 60 | 100 | 100 | 81 | mio |
| | 09 | Bit of a y | 84 | 91 | 86 | 100 | 100 | 84 | mj |
| | 10 | Bit of a y | 84 | 11 | 56 | 0 | 0 | 77 | nj |
| | 11 | Bit of a y | 83 | 6 | 31 | 0 | 0 | 79 | jn |
| | 12 | | 83 | 80 | 68 | 100 | 100 | 80 | mijə |
| | 13 | | 82 | 0 | 0 | 0 | 0 | 88 | ɒ |
| | 14 | Got a small "oui" | 80 | 1 | 5 | 0 | 0 | 84 | ija |
| | 15 | | 78 | 4 | 21 | 0 | 0 | 57 | mia |
| | 16 | | 78 | 10 | 53 | 0 | 0 | 80 | miə |
| | 17 | | 75 | 10 | 52 | 0 | 0 | 77 | we |
| | 18 | Got a "non!" | 75 | 8 | 44 | 0 | 0 | 85 | mjən |
| | 19 | | 72 | 79 | 66 | 100 | 100 | 78 | |
| | 20 | Bit of a y | 72 | 79 | 66 | 100 | 100 | 79 | miə |

      At the word level, the Azure pronunciation scores are all over the place, and so are the Allosaurus phonemes. For now, I think it might be better to tweak my interface so that I can more easily refer to the samples (text to speech or recordings extracted from my tutoring session) and compare new recordings, maybe with waveforms, spectrograms, and other plots.

      I like the progress I've made on a workflow for extracting and evaluating sentences or words. I'm looking forward to seeing what I can do once I understand a bit more of the research. Let me describe the workflow I have so far.

      Analyzing one tongue-twister across multiple sessions

      I can also compile different versions of one tongue-twister into one file to get a sense of my progress.

      (my-subed-record-analyze-file-with-azure
       (my-subed-record-collect-matching-subtitles
       "Le roi croit voir trois noix"
       '(("~/sync/recordings/processed/2026-02-20-raphael-tongue-twisters.vtt" . "Feb 20")
         ("~/sync/recordings/processed/2026-02-22-virelangues-single.vtt" . "Feb 22")
         ("~/proj/french/recordings/2026-02-26-virelangues-script.vtt" . "Feb 26")
         ("~/proj/french/recordings/2026-02-27-virelangues-script.vtt" . "Feb 27")
         ("~/proj/french/recordings/2026-03-03-virelangues.vtt" . "Mar 3")
         ("~/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt" . "Mar 6"))
       "~/proj/french/analysis/virelangues/Le-roi-croit-voir-trois-noix.vtt"
       nil 'my-subed-simplify))
      
| File | ID | Comments | All | Acc | Flu | Comp | Conf | Phonemes |
|---|---|---|---|---|---|---|---|---|
| | 01 | Feb 20 | 76 | 78 | 73 | 83 | 85 | ɛmiwəkɔwɑtwɑsmɑs |
| | 02 | Feb 22 | 97 | 98 | 97 | 100 | 88 | jusɒl̪vwatlɒnwa |
| | 03 | Feb 26 | 97 | 100 | 96 | 100 | 89 | əkwɒwɑtwanwɑ |
| | 04 | Feb 27 | 96 | 95 | 97 | 100 | 87 | jɛʀwɑl̪ɛvwɑtwɛnwɑ |
| | 05 | Mar 3 | 94 | 96 | 92 | 100 | 88 | juwakwɒl̪dwɑwanwɑ |
| | 06 | Mar 6: Uh huh | 98 | 100 | 97 | 100 | 88 | jal̪aʁwaɒll̪awandwa |
      Emacs Lisp code for collecting matching subtitles
      (defun my-subed-record-collect-matching-subtitles (text files output-file &optional match-fn transform-fn)
        "Find subtitles that match TEXT in FILES and write to OUTPUT-FILE.
      FILES can be a list of filenames or a list of (FILE . NOTE) pairs.
      By default, TEXT is an approximate match based on
      `subed-word-data-compare-normalized-string-distance'.
      If TRANSFORM-FN is specified, use that on SUBTITLE-TEXT before comparing.
      If MATCH-FN is specified, use that to match instead. It will be called
      with the arguments input-text and subtitle-text.
      Return OUTPUT-FILE."
        (subed-create-file
         output-file
         (seq-mapcat
          (lambda (o)
            (let ((media-file (subed-guess-media-file nil (expand-file-name (if (consp o) (car o) o)))))
              (seq-keep
               (lambda (sub)
                 (when transform-fn
                   (setf (elt sub 3) (funcall transform-fn (elt sub 3))))
                 (when (funcall (or match-fn 'subed-word-data-compare-normalized-string-distance) text (elt sub 3))
                   ;; Add audio note if missing
                   (unless (or (subed-record-get-directive "#+AUDIO" (elt sub 4))
                               (null media-file))
                     (setf (elt sub 4)
                           (subed-record-set-directive "#+AUDIO" media-file (or (elt sub 4) ""))))
                   ;; Prepend note if specified
                   (when (and (consp o) (cdr o))
                     (let ((current-note (subed-record-get-directive "#+NOTE" (elt sub 4))))
                       (setf (elt sub 4)
                             (subed-record-set-directive
                              "#+NOTE"
                              (if current-note
                                  (concat (cdr o) ": " current-note)
                                (cdr o))
                              (or (elt sub 4) "")))))
                   sub))
          ;; FILES may hold plain filenames or (FILE . NOTE) pairs.
          (subed-parse-file (if (consp o) (car o) o)))))
          files)
         t)
        output-file)
      

      Workflow

      Extracting parts of the recording

      I use OBS to record both my microphone and the tutor's voice. I use WhisperX to transcribe my recording with speaker diarization.

      Shell script
      #!/bin/zsh
      WHISPER_ARGS=(${(z)WHISPER_FLAGS})
      MAX_LINE_WIDTH="${MAX_LINE_WIDTH:-50}"
      MODEL="${MODEL:-large-v2}"
      for FILE in "$@"; do
          text="${FILE%.*}.txt"
          if [  -f "$text" ]; then
             echo "Skipping $FILE as it's already been transcribed."
          else
        ~/vendor/whisperx/.venv/bin/whisperx --model "$MODEL" --diarize --hf_token $HUGGING_FACE_API_KEY --language fr --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --compute_type int8 --print_progress True --max_line_width $MAX_LINE_WIDTH --segment_resolution chunk --max_line_count 1 --initial_prompt "Emacs et Org Mode sont d'excellents outils. J'utilise Org-roam pour prendre des notes. Today I am recording a braindump about technical setups. C'est vraiment utile pour la productivité." "$FILE" "${WHISPER_ARGS[@]}"
              rm -f "${FILE%.*}.srt"
          fi
      done
      

      I can review the VTT manually, but it's also useful to be able to quickly extract different attempts at the phrases or words. I added subed-record-extract-all-approximately-matching-phrases to subed-record.el so that I can generate a starting point with something like this:

      (subed-record-extract-all-approximately-matching-phrases
         phrases
         "/home/sacha/sync/recordings/processed/2026-03-06-raphael.json"
         "/home/sacha/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt")
      

      Ideas:

      • Group by speaker ID to make it possible to extract phrases even with interstitial corrections.
      • Add an optional parameter that lets me append to an existing file.

      Here's a copy of its output: file:///home/sacha/proj/french/analysis/virelangues/2026-03-06-raphael-script-original.vtt

Figure 1: Screenshot with subed-waveform showing the waveforms for each segment

      Based on the waveforms, I can see that some timestamps need to be adjusted, and some phrases may need to be duplicated, trimmed, or split. This is the file that I ended up with.

      file:///home/sacha/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt

      Sometimes I want to do a deep dive on a specific word. Here's another function that uses the WhisperX JSON data to extract just single words, like in my analysis of "mule".

      Extracting words
(defun my-subed-record-extract-words (word word-data-file output-file)
  "Extract cues whose text matches WORD from the WhisperX data in WORD-DATA-FILE.
Write them to OUTPUT-FILE, sorted by descending WhisperX score."
  (let ((media-file (subed-guess-media-file nil word-data-file)))
          (subed-create-file
           output-file
           (mapcar (lambda (o)
                     (list nil
                           (alist-get 'start o)
                           (alist-get 'end o)
                           (alist-get 'text o)
                           (format "#+AUDIO: %s\n#+WHISPER_SCORE: %d\n#+SPEAKER: %s"
                                   media-file
                                   (* (alist-get 'score o) 100)
                                   (or (alist-get 'speaker o) ""))))
                   (sort
                    (seq-filter
                     (lambda (o)
                       (subed-word-data-compare-normalized-string-distance
                         word
                         (alist-get 'text o)))
                     (subed-word-data-parse-file
                      word-data-file))
                    :key (lambda (o) (alist-get 'score o))
                    :reverse t))
           t)))
      

      Azure pronunciation assessment

      I can use the following code from any subtitle of 30 seconds or less to automatically extract the audio for that subtitle and add a comment with the scores from Microsoft Azure pronunciation assessment. It needs an API key and region. I think the free tier includes 5 hours of speech each month, with additional hours priced at USD 0.66 per hour for short files less than 30 seconds (billed in 1-second increments). There's another API that can handle longer segments for USD 1.32 per hour (also included in the 5 hours free), but I'll probably need a Python or NodeJS program. Since I'm working with words and short sentences for now, I can use the REST API.

      (defvar my-subed-record-azure-assessment nil)
      (defvar my-subed-record-azure-assess-pronunciation-lang "fr-FR")
;; (my-azure-assess-pronunciation "~/proj/french/analysis/mule/ref-france-sample.wav" "mule")
      (defun my-azure-assess-pronunciation (audio-file reference-text)
        "Send AUDIO-FILE to Azure for pronunciation assessment against REFERENCE-TEXT.
      Needs the AZURE_SPEECH_REGION and AZURE_SPEECH_KEY environment variables."
        (interactive (list (read-file-name "Audio: ")
                           (read-string "Text: ")))
        (let* ((url (format "https://%s.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=%s&format=detailed"
                            (getenv "AZURE_SPEECH_REGION")
                            (url-hexify-string my-subed-record-azure-assess-pronunciation-lang)))
               ;; 1. Create the Configuration JSON
               (config-json (json-encode `(("ReferenceText" . ,reference-text)
                                           ("GradingSystem" . "HundredMark")
                                           ("Granularity"   . "Phoneme")
                                           ("Dimension"     . "Comprehensive"))))
               ;; 2. Base64 encode the config
               (config-base64 (base64-encode-string (encode-coding-string config-json 'utf-8) t))
               ;; 3. Prepare headers
               (headers `(("Ocp-Apim-Subscription-Key" . ,(getenv "AZURE_SPEECH_KEY"))
                          ("Pronunciation-Assessment"  . ,config-base64)
                          ("Content-Type"              . "audio/wav; codecs=audio/pcm; samplerate=16000")))
               (data (plz 'post url
                       :headers headers
                       :body `(file ,audio-file)
                       :as #'json-read)))
          (when (called-interactively-p 'any)
            (kill-new (my-subed-record-azure-format-assessment data))
            (message "%s" (my-subed-record-azure-format-assessment data)))
          data))
      
      (defun my-subed-record-azure-assess-pronunciation (&optional beg end)
        "Assess the pronunciation of the current cue."
        (interactive (if (region-active-p)
                         (list (region-beginning)
                               (region-end))
                       (list (subed-subtitle-start-pos)
                             (save-excursion
                               (subed-jump-to-subtitle-end)
                               (point)))))
        (subed-for-each-subtitle beg end t
          (let* ((temp-file (make-temp-file "subed-record-azure" nil ".wav"))
                 (text (my-subed-simplify (subed-subtitle-text)))
                 data
                 score)
            (subed-record-extract-audio-for-current-subtitle-to-file temp-file)
            (setq data
                  (my-azure-assess-pronunciation
                   temp-file
                   text))
            (when data
              (subed-record-set-directive
               "#+SCORE"
               (my-subed-record-azure-format-assessment data))
              (setq my-subed-record-azure-assessment data))
            (delete-file temp-file)
            data)))
      
      (defun my-subed-record-azure-format-assessment (&optional data)
        (setq data (or data my-subed-record-azure-assessment))
        (let-alist (car (alist-get 'NBest data))
          (format "%d; A: %d, F: %d, C: %d, Conf: %d"
                  (or .PronunciationAssessment.PronScore .PronScore)
                  (or .PronunciationAssessment.AccuracyScore .AccuracyScore)
                  (or .PronunciationAssessment.FluencyScore .FluencyScore)
                  (or .PronunciationAssessment.CompletenessScore .CompletenessScore)
                  (* 100.0 (or .PronunciationAssessment.Confidence
                               .Confidence)))))
      
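For reference, the formatted string (which ends up in the #+SCORE directive) looks like this, plugging in the Mar 3 values from the first table:

;; (my-subed-record-azure-format-assessment data)
;; => "94; A: 93, F: 93, C: 100, Conf: 90"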

      Phonemes with Allosaurus, Espeak, or Kokoro FastAPI

      (defvar my-subed-record-allosaurus-command '("/home/sacha/proj/french/.venv/bin/python3" "-m" "allosaurus.run" "--lang" "fra" "-i"))
      (defun my-subed-record-allosaurus-phonemes (&optional beg end)
        (interactive (if (region-active-p)
                         (list (region-beginning)
                               (region-end))
                       (list (subed-subtitle-start-pos)
                             (save-excursion
                               (subed-jump-to-subtitle-end)
                               (point)))))
        (subed-for-each-subtitle beg end t
          (let* ((temp-file (make-temp-file "subed-record-allosaurus" nil ".wav")))
            (subed-record-extract-audio-for-current-subtitle-to-file temp-file)
            (subed-record-set-directive
             "#+PHONEMES"
             (with-temp-buffer
               (apply #'call-process (car my-subed-record-allosaurus-command)
                      nil t nil
                      (append
                       (cdr my-subed-record-allosaurus-command)
                       (list temp-file)))
               (delete-file temp-file)
               (replace-regexp-in-string " " "" (buffer-string)))))))
      
      (defun my-lang-espeak-ng-phonemes (text)
        (interactive "MText: ")
        (let ((data
               (with-temp-buffer
                 (call-process "espeak" nil t nil "-q" "--ipa" text)
                 (string-trim (buffer-string)))))
          (when (called-interactively-p 'any)
            (kill-new data)
            (message "%s" data))
    ;; Return the IPA transcription, not the input text.
    data))
      
      (defun my-french-kokoro-fastapi-phonemes (s)
        (interactive "MText: ")
        (my-kokoro-fastapi-ensure)
        (let ((data (alist-get 'phonemes
                               (plz 'post "http://localhost:8880/dev/phonemize"
                                 :headers '(("Content-Type" . "application/json"))
                                 :body (json-encode `((text . ,s)
                                                      (language . "fr-fr")))
                                 :as #'json-read))))
          (when (called-interactively-p 'any)
            (kill-new data)
            (message "%s" data))
          data))
      

      Splitting into segments and making a table

      (defun my-subed-record-make-groups (subtitles)
        "Come up with a good ID for attempts, grouping by cue text."
        (let* ((group-num 0)
               (sub-num 0)
               (grouped ; ((text . (start start start)) ...)
                (mapcar
                 (lambda (o)
                   (cons (car o)
                         (mapcar
                          (lambda (sub) (elt sub 1))
                          (cdr o))))
                 (seq-group-by
                  (lambda (o) (elt o 3))
                  (seq-remove (lambda (o)
                                (and (elt o 4)
                                     (string-match "#\\+SKIP" (elt o 4)))) subtitles))))
               (one-group (= (length grouped) 1)))
          (seq-mapcat
           (lambda (o)
             (setq group-num (1+ group-num))
             (unless one-group (setq sub-num 0))
             (seq-map
              (lambda (sub-start)
                (setq sub-num (1+ sub-num))
                (cons sub-start (if one-group
                                    (format "%02d" sub-num)
                                  (format "%02d-%02d" group-num sub-num))))
              (cdr o)))
           grouped)))
      
      (defun my-subed-record-analyze-file-with-azure (vtt &optional always-create filter)
        (with-current-buffer (find-file-noselect vtt)
          (let* (results
                 filename
                 (ids (my-subed-record-make-groups
                       (seq-filter
                        (lambda (o)
                          (if filter
                              (string-match filter (elt o 3))
                            'identity))
                        (subed-subtitle-list))))
                 id)
            (subed-for-each-subtitle (point-min) (point-max) t
              (unless (and (subed-subtitle-comment)
                           (string-match "#\\+SKIP" (subed-subtitle-comment))
                           (or (null filter)
                               (string-match filter (subed-subtitle-text))))
                (setq id (alist-get (subed-subtitle-msecs-start) ids))
                (setq filename (expand-file-name
                                (format "%s-%s.opus"
                                        (file-name-base vtt)
                                        id)
                                (file-name-directory vtt)))
                (when (or always-create (not (file-exists-p filename))) (subed-record-extract-audio-for-current-subtitle-to-file filename))
                (unless (string-match (regexp-quote "#+SCORE") (subed-subtitle-comment))
                  (my-subed-record-azure-assess-pronunciation))
                (unless (string-match (regexp-quote "#+PHONEMES") (subed-subtitle-comment))
                  (my-subed-record-allosaurus-phonemes))
                (let* ((comment (subed-subtitle-comment))
                       (scores (mapcar (lambda (o)
                                         (if (string-match (concat (regexp-quote o) ": \\([0-9]+\\)") comment)
                                             (match-string 1 comment)
                                           ""))
                                       '("#+WHISPER_SCORE" "#+SCORE" "A" "F" "C" "Conf")))
                       (phonemes (when (string-match "#\\+PHONEMES: \\(.+\\)" comment)
                                   (match-string 1 comment)))
                       (text (subed-subtitle-text))
                       (extra-comment (when (string-match "#\\+NOTE: \\(.+\\)" comment) (match-string 1 comment))))
                  (push
                   (append
                    (list (org-link-make-string (format "audio:%s?icon=t" filename) "▶️")
                          id
                          (or extra-comment ""))
                    scores
                    (list
                     (org-link-make-string (concat "abbr:" text) phonemes)))
                   results))))
            (setq results
                  (cons
                   '("File" "ID" "Comments" "WhisperX" "All" "Acc" "Flu" "Comp" "Conf" "Phonemes")
                   results))
            (my-org-table-remove-blank-columns results t))))
      
      (defun my-org-table-remove-blank-columns (data &optional has-header)
        "Remove blank columns from DATA.
      Skip the first line if HAS-HEADER is non-nil."
        (cl-loop
   for i from (1- (length (car data))) downto 1
         do (unless (delq nil
                          (mapcar
                           (lambda (o)
                             (and (elt o i)
                                  (not (string= (elt o i) ""))))
                           (if has-header
                               (cdr data)
                             data)))
              (setq data
                    (mapcar
                     (lambda (o)
                       (seq-remove-at-position o i))
                     data))))
        data)
      

      Collecting segments from multiple sessions

First I collect a sample from different files.

      You can e-mail me at sacha@sachachua.com.

    14. 🔗 HexRaysSA/plugin-repository commits sync repo: +2 releases rss
      sync repo: +2 releases
      
      ## New releases
      - [DeepExtract](https://github.com/marcosd4h/DeepExtractIDA): 0.9.10
      - [IDAGuides](https://github.com/libtero/idaguides): 1.3.0
      
    15. 🔗 r/LocalLLaMA Nemotron 3 Super Released rss
    16. 🔗 r/Leeds Positive Impact: South Leeds Shops Seeing Less Crime rss

      Some good news from South Leeds around safety and security:

      • The Yorkshire Evening Post recently reported that shops and retail parks are seeing a drop in anti-social behaviour and retail crime.
      • West Yorkshire Police and local partners have been using injunctions, community warnings, and early intervention.
      • Businesses are reporting fewer incidents, and staff and customers feel more confident.

      It’s a great example of how visible, targeted policing and collaboration can make a real difference in keeping retail areas safe.

      submitted by /u/securitycompanyuk
      [link] [comments]

    17. 🔗 r/york York residents – a short survey on the Bootham Crescent stadium relocation rss

      Hi everyone,

      I’m currently conducting research for my university dissertation on the relocation of the stadium at Bootham Crescent and how it has affected local communities and perceptions of the surrounding area.

As part of this research, I've created a short 10-minute anonymous survey looking at the social, physical, and wider perceptions of the stadium move. I'm looking for responses from:

      • People who lived near Bootham Crescent before the move
      • Current residents in the area
      • Residents elsewhere in York
      • People who have moved to York in recent years

      All responses are completely anonymous and will be used solely for academic research.

      If you have a few minutes, I would really appreciate your help by completing the survey below:

      https://qualtricsxmw68qycjfg.qualtrics.com/jfe/form/SV_bqMry2RhFEP6z5k

      Thank you very much for your time — every response really helps with the research.

      submitted by /u/_samjustice
      [link] [comments]

    18. 🔗 r/Leeds Schiacciata Sandwiches rss

Was over in Manchester last week and had an amazing schiacciata sandwich at Ad Maiora (https://www.instagram.com/admaioramcr). Does anyone know anywhere in Leeds that does really good Italian sandwiches? I know La Bottega Milanese does something similar, but they are not made to order and look a bit sad in the glass cabinets after a while.

      submitted by /u/zharrt
      [link] [comments]

    19. 🔗 r/york Badminton 🏸 rss

      Anyone know of any casual badminton clubs in York? Used to play regularly a few years back at one of the clubs at the railway institute, but it's been a while and I'm very rusty!

      Or if you are solo and fancy a game, do shout! Happy for a pint/coffee after too.

      submitted by /u/CheekyChappie157
      [link] [comments]

    20. 🔗 r/Leeds Cheap Monstera Thai constellation in Leeds Kirkgate market rss

Saw this in the market garden shop yesterday. I didn't get it because, sadly, I got one last year and paid more than this one costs.

      submitted by /u/Important_Sail4961
      [link] [comments]

    21. 🔗 r/LocalLLaMA M5 Max just arrived - benchmarks incoming rss

M5 Max just arrived - benchmarks incoming | The M5 Max 128GB 14" has just arrived. I've been looking forward to putting this through its paces. Testing begins now. Results will be posted as comments below — no video, no lengthy writeup, just the raw numbers. Clean and simple.

      Apologies for the delay. I initially ran the tests using BatchGenerator, but the speeds weren't quite what I expected. I ended up setting up a fresh Python virtual environment and re-running everything with pure mlx_lm using stream_generate, which is what pushed the update back. I know many of you have been waiting - I'm sorry for keeping you! I take it as a sign of just how much excitement there is around the M5 Max. (I was genuinely hyped for this one myself.)

      Personally, I'm really happy with the results. What do you all think?

      Models Tested

      • Qwen3.5-122B-A10B-4bit
      • Qwen3-Coder-Next-8bit
      • Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-6bit
      • gpt-oss-120b-MXFP4-Q8

As for Qwen3.5-35B-A3B-4bit — I don't actually have that one downloaded, so unfortunately I wasn't able to include it. Sorry about that! Results were originally posted as comments, and have since been compiled here in the main post for easier access.

All runs: mlx_lm.generate --model /Volumes/SSD/Models/<model> --prompt "$(cat /tmp/prompt_<N>.txt)" --max-tokens 128

| Model | Prompt tokens | Prompt t/s | Gen tokens | Gen t/s | Peak memory |
|---|---|---|---|---|---|
| Qwen3.5-122B-A10B-4bit | 4106 | 881.466 | 128 | 65.853 | 71.910 GB |
| Qwen3.5-122B-A10B-4bit | 16394 | 1239.734 | 128 | 60.639 | 73.803 GB |
| Qwen3.5-122B-A10B-4bit | 32778 | 1067.824 | 128 | 54.923 | 76.397 GB |
| Qwen3-Coder-Next-8bit | 4105 | 754.927 | 60 | 79.296 | 87.068 GB |
| Qwen3-Coder-Next-8bit | 16393 | 1802.144 | 60 | 74.293 | 88.176 GB |
| Qwen3-Coder-Next-8bit | 32777 | 1887.158 | 58 | 68.624 | 89.652 GB |
| Qwen3-Coder-Next-8bit | 65545 | 1432.730 | 61 | 48.212 | 92.605 GB |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-6bit | 4107 | 811.134 | 128 | 23.648 | 25.319 GB |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-6bit | 16395 | 686.682 | 128 | 20.311 | 27.332 GB |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-6bit | 32779 | 591.383 | 128 | 14.908 | 30.016 GB |
| Qwen3.5-27B-Claude-4.6-Opus-Distilled-MLX-6bit | 65547 | 475.828 | 128 | 14.225 | 35.425 GB |
| gpt-oss-120b-MXFP4-Q8 | 4164 | 1325.062 | 128 | 87.873 | 64.408 GB |
| gpt-oss-120b-MXFP4-Q8 | 16452 | 2710.460 | 128 | 75.963 | 64.857 GB |
| gpt-oss-120b-MXFP4-Q8 | 32836 | 2537.420 | 128 | 64.469 | 65.461 GB |

      submitted by /u/cryingneko
      [link] [comments]

    22. 🔗 r/Yorkshire What a Yorkshire sunrise! rss
    23. 🔗 r/Leeds Drop off at Leeds train station rss

Morning! My partner dropped me off at the train station earlier at the section of road where the Spoons is (where taxis used to drop you off). Is this where people are allowed to be dropped off in a private vehicle, or will we receive a penalty? He stopped right by a zebra crossing for a pedestrian and I jumped out.

Sorry, I'm a bit stressed out as I have been appealing parking charges at Manchester airport and don't want to have to go through that again 😅

      submitted by /u/lcwj
      [link] [comments]

    24. 🔗 r/LocalLLaMA New benchmark just dropped. rss

New benchmark just dropped. | Write the complete Three.js code for a scene featuring Michael Jackson, Pepe the Frog, Donald Trump, and Elon Musk performing the "Thriller" choreography, aiming for maximum visual perfection, detailed animation, lighting, high-quality rendering, and an overall cinematic feel. submitted by /u/ConfidentDinner6648
      [link] [comments]

    25. 🔗 r/Yorkshire Be honest is Yorkshire Tea actually the best tea? rss

      This might be a controversial question, but I’m curious where people stand on this.

      submitted by /u/1ChanceChipmunk1
      [link] [comments]

    26. 🔗 r/york Are there any known pubs with employee accommodation? rss
    27. 🔗 r/reverseengineering Anker/EufyMake UV Printer software RE (ongoing) rss
  3. March 10, 2026
    1. 🔗 IDA Plugin Updates IDA Plugin Updates on 2026-03-10 rss

      IDA Plugin Updates on 2026-03-10

      New Releases:

      Activity:

      • binlex
      • capa
        • c03d833a: rules: handle empty or invalid YAML documents in Rule.from_yaml (#2903)
        • 1f4a16cb: loader: skip PE files with unrealistically large section virtual size…
        • 2c9e30c3: perf: eliminate O(n²) tuple growth and reduce per-match overhead (#2890)
        • 8c138e3d: loader: handle struct.error from dnfile and raise CorruptFile with a …
        • a11a03bc: build(deps): bump minimatch and editorconfig in /web/explorer (#2892)
      • ghidra
      • ida-hcli
        • cfce4917: feat: add KE download support to ida:// protocol handler
      • idaguides
        • 257c6b4d: added: Config class changed: calculating hexrays indent from pseudoco…
      • msc-thesis-LLMs-to-rank-decompilers
      • playlist
      • python-elpida_core.py
        • d7c70473: BUG 13: Word-boundary matching for Parliament veto triggers + signal …
        • 691c7ca0: BUG 12: D15 cooldown init imposed 50-cycle dead zone
        • 19d74449: BUG 11: Fix D15 permanent blackout during recursion-breaking
        • 87543ee4: BUG 10: Break meta-diagnostic doom loop + rotate tension templates
        • af32708f: BUG 9: Filter irrelevant world feed items + expand Wikipedia noise fi…
        • a73132f2: BUG 8: Strip metadata from LLM escalation text + enhanced diagnostics
        • 48098e74: BUG 7b/7c: Strip metadata from signal detection + diagnostic capture
        • 6898ea12: BUG 6c: Fix A7/A10 keywords — violation indicators, not topic words
        • 06a7fcb9: BUG 6b: Decouple semantic scores from violation signals
        • 5c6ef385: Section 21: BUG 6 semantic embedding implementation — parliament sees…
        • 31023936: BUG 6: Semantic embedding layer for parliament signal detection
        • 71facde0: Section 20: Deep breath checkpoint — Body 14 recovery proof, axiom ge…
      • symbolicator
        • 31804763: chore: add 25.4 KEXTs
        • 87aac767: chore: add xnu 25.4 🎉
        • 7751e6ed: chore: put version.max back to 25.4 for darwin 25.3
        • ed380b9f: chore: update 25.3
        • 68c2cad2: chore: re-run xnu 25.3 w/ 26.3.1 KDK (and make work on 26.4 beta for …
    2. 🔗 r/LocalLLaMA 1 million LocalLLaMAs rss

      1 million LocalLLaMAs | it took just 3 years submitted by /u/jacek2023
      [link] [comments]

    3. 🔗 r/reverseengineering Released a crackme this week. Someone reconstructed the hash in Python, brute forced for an hour - then patched the jump. That was the correct solution. rss
    4. 🔗 r/wiesbaden Bin neu hier und Suche Freunde rss

Hey, I'm 18, male, and moved to Wiesbaden about a week ago. I'm looking for friends, people I can go out partying with and just spend time with. Feel free to message me if you'd like to go for a drink sometime and can and want to show me the city :)

      submitted by /u/Fair-Prune2346
      [link] [comments]

    5. 🔗 r/Harrogate Wedding dress rss

Hey, I'm going shopping for the above with/for my sister, to help give advice. Are there any places you would recommend in and around Harrogate? We're going on Thursday. I haven't much knowledge of the area, so I'd appreciate any help or tips on how to be a good supporter!

      submitted by /u/ProperSignificance24
      [link] [comments]

    6. 🔗 r/Yorkshire The view down Lombards Wynd, Richmond, Nth Yorkshire. rss

The view down Lombards Wynd, Richmond, Nth Yorkshire. | In early medieval times St Martin's Priory ran a woollen mill at the foot of this Wynd. Wool merchants from Lombardy would bring their goods to trade, trekking down the hill to the mill, hence Lombards Wynd. Today the former CofE Primary School sits near the foot of the hill, with Easby Abbey visible in the distance. submitted by /u/Still_Function_5428
      [link] [comments]

    7. 🔗 r/LocalLLaMA I regret ever finding LocalLLaMA rss

I regret ever finding LocalLLaMA | It all started with using "the AI" to help me study for a big exam. Can it make some flashcards or questions? Then Gemini. Big context, converting PDFs, using markdown, custom system instruction on Ai Studio, API. Then LM Studio. We can run this locally??? Then LocalLLama. Now I'm buying used MI50s from China, quantizing this and that, squeezing every drop in REAP, custom imatrices, llama forks. Then waiting for GLM flash, then Qwen, then Gemma 4, then "what will be the future of Qwen team?". Exam? What exam? In all seriousness, I NEVER thought, of all things to be addicted to (and be so distracted by), local LLMs would be it. They are very interesting though. I'm writing this because just yesterday, while I was preaching Qwen3.5 to a coworker, I got asked what the hell I was talking about and then what the hell I expected to gain from all this "local AI" stuff I talk so much about. All I could think about was that meme. https://preview.redd.it/o7e97f302aog1.png?width=932&format=png&auto=webp&s=98e0f8f9bd30bb9c49c18e3b7ed03751d605cc86 submitted by /u/xandep
      [link] [comments]

    8. 🔗 r/LocalLLaMA Qwen3.5-35B-A3B Uncensored (Aggressive) — GGUF Release rss

      The one everyone's been asking for. Qwen3.5-35B-A3B Aggressive is out!

Aggressive = no refusals; it has NO personality changes/alterations or any of that. It is the ORIGINAL release of Qwen, just completely uncensored.

https://huggingface.co/HauhauCS/Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive

      0/465 refusals. Fully unlocked with zero capability loss.

      This one took a few extra days. Worked on it 12-16 hours per day (quite literally) and I wanted to make sure the release was as high quality as possible. From my own testing: 0 issues. No looping, no degradation, everything works as expected.

      What's included:

      - BF16, Q8_0, Q6_K, Q5_K_M, Q4_K_M, IQ4_XS, Q3_K_M, IQ3_M, IQ2_M

      - mmproj for vision support

      - All quants are generated with imatrix

      Quick specs:

      - 35B total / ~3B active (MoE — 256 experts, 8+1 active per token)

      - 262K context

      - Multimodal (text + image + video)

      - Hybrid attention: Gated DeltaNet + softmax (3:1 ratio)

      Sampling params I've been using:

      temp=1.0, top_k=20, repeat_penalty=1, presence_penalty=1.5, top_p=0.95, min_p=0

      But definitely check the official Qwen recommendations too as they have different settings for thinking vs non-thinking mode :)

Note: Use the --jinja flag with llama.cpp. LM Studio may show "256x2.6B" in the params for the BF16 one; it's cosmetic only, and the model runs 100% fine.

      Previous Qwen3.5 releases:

      - Qwen3.5-4B Aggressive

      - Qwen3.5-9B Aggressive

      - Qwen3.5-27B Aggressive

      All my models: HuggingFace HauhauCS

      Hope everyone enjoys the release. Let me know how it runs for you.

The community has been super helpful with Ollama; please read the discussions on the other models on Hugging Face for tips on making it work with it.

      submitted by /u/hauhau901
      [link] [comments]

    9. 🔗 r/reverseengineering Your Duolingo Is Still Talking to ByteDance: How Pangle Fingerprints You Across Apps After You Said No rss
    10. 🔗 r/reverseengineering Reverse Engineering Binaries With AI rss
    11. 🔗 sacha chua :: living an awesome life La semaine du 2 mars au 8 mars rss

Monday, March 2

I prepared my Emacs newsletter and wrote an article about displaying hints for keyboard shortcuts. I also tried expanding snippets by voice command. I think snippet expansion by voice is useful because when I insert a snippet from its initials, I have to think of the phrase and then think of the initial letters, but when I insert a snippet by voice command, I can use the natural phrase. Of course, there's a brief delay for transcription, but it's short enough not to break my train of thought.

My daughter was too tired for her gymnastics class, so I took her to the dentist for an exam because of her tooth pain. The dentist said her gums are a little swollen. She recommended softening her toothbrush under hot water before brushing and maybe using a saline mouthwash. My daughter complained that her teeth feel too crowded. The dentist said that's acceptable for now, and if we want, she can refer us to an orthodontist. When I was younger, I couldn't stand braces, but my daughter may be able to tolerate them. I think it's better for us to wait until the peak of viral concentration in the wastewater has passed.

After the dishes and my evening routine, my daughter and I hand-sewed our little bag project with a few pockets.

Tuesday, March 3

I worked on tongue-twisters during the session with my tutor. The "r" and "u" sounds continued to give me trouble. I'll work on the difference between « roue » and « rue », the word « brume », and a few others. He said the "r" needs less air.

Today's results:

I wonder what a good method and a good interface would be for practising pronunciation on my own between sessions with my tutor. I think the process includes the following steps:

1. Learn to hear the difference between the example and an incorrect utterance: first distinguishing that they are different, then understanding why.
  • If I extract the utterances from my recordings and annotate them with my tutor's classifications, I can use them for supervised learning to train my ear. These recordings would be too boring for anyone else, but for me it may be better to listen to them so I learn more.
2. Identify which of two utterances is better.
  • I can randomize the short recordings from the previous step to make a game.
3. Try producing varied sounds. I have to practise; there's obviously no other way.
4. Listen for the difference between the example and the sound I produced. Decide whether the sound is good enough. Think about the connection between mouth movements and the sound they produce.
5. Produce the sound in isolation. Connect the internal sensation of producing the sound with the sound I want to produce, because the sound I record differs from the sound I hear while speaking.
6. Produce the sound consistently.
7. Produce the sound even when I'm not listening to a model and haven't just repeated it.
8. Use the sound in the context of a phrase, with pauses.
9. Say the phrase more fluently.
10. Say the phrase without an example.

If this were an easily solved problem, everyone would use and recommend the solution. I don't think there's a good solution on the market, apart from the method I used to train my little general-human-intelligence project (who is 10 years old now, as she often reminds me): a massive amount of data. But of course, there's a lot of research I can draw on.

Oooh, I can't wait to try spectrograms in addition to waveforms. There are some programs that can display spectrograms even in real time. That might make vowel analysis easier.

So, I can use WhisperX's per-word timestamps to segment the recording. But I have to listen to them in the context of the session to match them with my tutor's comments, unless speaker diarization is reliable enough to identify which utterances got a « oui » or « c'est mieux » from my tutor and which ones made him say « non ». For now, I think it's more reliable if I listen to the conversation and annotate the segments myself, so an interface that displays segmented waveforms and lets me make selections with keyboard shortcuts would be useful. If scores are available, displaying them as a bar chart might be more precise and easier to compare than displaying them with a colour gradient. I can look at Label Studio or Praat for ideas to implement in Emacs. Or, if I use Audino 2.0 or similar web projects, I can annotate during spare moments.

During practice, I think my interface should start playback of my tutor's recording and maybe display the waveform or spectrogram. It should also record my voice, since it has to play the tutor's example and capture my attempt for comparison along with the WhisperX confidence score. Keyboard shortcuts would trigger one or the other.

Our network

My tutor had a question about computer networks, so I'm going to take the opportunity to explain our network in French and learn several technical words along the way. My husband is mainly responsible for maintaining our network, but I should learn about it too.

My husband recommended some resources for people who are interested:

• Jim's Garage: highly recommended, but the Homelab 2.0 he's discussed in recent videos is starting to get expensive.
• Serve the Home
• Reddit, of course

Our network:

• Our ISP's fibre modem connects to a Lenovo M920q mini PC running Proxmox for firewall management and a few virtual machines. One of the virtual machines is OPNSense, which handles network addresses, the firewall, traffic shaping (including the rule that cuts off our child's internet access late at night), and various virtual networks (VLANs) to isolate different devices via the Intel 893647 Gigabit network adapter. Internet-of-Things devices often lack updates, so my husband wants to isolate them from our other computers. OPNSense itself does get updates. In fact, my husband updated it recently, and it went from 16 to 32 gigabytes of RAM. My husband said he appreciates that the Lenovo M920q is fairly quiet.
• The M920q connects to an ASUS GS108Tv2 network switch, which connects to the Synology DS718+ for network storage and to the Odroid-XU4, which also runs PiHole to cut down on ads. Proxmox on the M920q also has a virtual machine that is responsible for backing up files to the Synology DS718+.
• The ASUS GS108Tv2 switch connects to the ASUS RT-AC66U wifi router, which runs FreshTomato to get more control than with the fibre modem. It is capable of 5 GHz wifi and it can handle virtual wifi networks (two or more SSIDs in the same 2.4 GHz or 5 GHz band) to isolate devices like the thermostat. That way, trusted devices like our computers are not visible to the insecure devices.
• The wifi router connects to an unmanaged switch which connects to an Odroid-C4 running OpenELEC and to our old Sony PS3.

We used to run our network on the ASUS RT-AC66U wifi router with FreshTomato, but my husband upgraded to the Lenovo M920q to make managing the virtual networks easier and to optimize throughput. He said he chose the components to minimize space, power consumption, and noise. Nothing is new; everything can be bought on eBay or the second-hand market. Right now RAM and storage are very expensive, and we don't need high availability or replication.

Figure: our network diagram (network.png)

After school, my daughter had some energy, so I took her to a make-up gymnastics class. It was a group aerial-silks class. While my daughter took part in class, I studied my Anki cards. Overall she liked the class, except for her lost socks. Unfortunately, someone took my daughter's socks instead of their own. I held back from saying she should have given me her things to look after.

Wednesday, March 4

I wrote an article about expanding snippets with speech recognition in Emacs and other applications.

I tried Azure's pronunciation assessment and phoneme transcription with the Allosaurus library, but I don't think they are reliable or suited to my goals. I don't know whether the Azure scores are useful. Allosaurus doesn't give me the IPA that I want, even when I analyze my tutor's recording. (I should check it against text-to-speech output…)

The FSI phonology course contrasts two similar short examples to develop the skill of identifying differences. For now, it's better to improve my process for extracting and listening to the voice segments from my session than to practise in a way that is unreliable and probably incorrect, but confidently so.

My daughter and I ran errands. After a break, we went to the park to play Pokémon Go with lots of other trainers. We won a few raids, but my daughter didn't catch the Pokémon she wanted. She was a little disappointed, but she said it was a good walk anyway.

My daughter was in a bad mood at bedtime because of my advice during toothbrushing. I stayed calm and gave her space.

Thursday, March 5

My daughter woke up on her own this morning and had breakfast, but she didn't want to attend her online classes. Nagging her doesn't help, so I let her manage her own emotions. I worked on the piano. I also improved the automation for collecting delivery milestones for the Bike Brigade using Spookfox. I discovered that the key is to use the code

      document.querySelector('form[phx-change="update_options"]')
        .dispatchEvent(new Event('submit', {bubbles: true, cancelable:true}))
      

      to update the table after changing the dates. Spookfox doesn't let me wait for the result if it takes a while, so I have to wait in Emacs Lisp like this:

      (let (result)
        (dolist (block-name '("milestone-this-month-set"
                              "milestone-this-month-get"
                              "milestone-before-month-set"
                              "milestone-before-month-get"
                              "milestone-after-month-set"
                              "milestone-after-month-get"
                              "milestone-summary"))
          (setq result
                 (org-babel-execute-src-block
                  nil
                  (org-babel-lob--src-info block-name)
                  nil 'babel-call))
          (when (string-match "-set" block-name)
            (message "Waiting after %s..." block-name)
            (sit-for 3)))
        (kill-new result)
        (message "Copied."))
      

      That way, I simplified the process and reduced the number of clicks. The full code is here.

      Friday, March 6

      I loved working on my pronunciation using my notes about our network, which my tutor quizzed me on on Tuesday and which my husband helped me with. I still need to work on the alphabet, which I need in order to read model names out loud. My tutor also has questions about LLMs. I'm looking forward to writing more notes.

      We rearranged some furniture because our daughter's new bed arrives tomorrow. We moved the shelves in my daughter's room into a corner, which is my new desk space.

      My daughter was really frustrated with school today. She skipped her classes, and she wanted to come home early from her outing with her friend. I think it was a bit of a hard day for her. I reminded myself to think long-term, without nagging.

      Saturday, March 7

      My daughter and I played Dungeons & Dragons with my sisters and my nieces. We enjoyed the game. In the story, there were kobolds living in one of the Caves of Chaos who regretted catching a bear. The bear was very hungry, and so were the kobolds, because they kept giving it their food to avoid getting hurt. The cleric (my daughter) and the fighter (one of my nieces) managed to lure the bear outside with blueberries. My sister the wizard led the charge against the marauders living in another cave, and we defeated them. In one room we saw two chests, but we found out that one of them was actually a mimic. After another fight, we found 150 gold pieces, some boots, and a mysterious potion.

      After lunch, my daughter and I took a walk in the park while we played Pokémon Go. The weather was lovely, with a lot of mist that felt a little magical.

      Then my husband and I took apart my daughter's old bed and a few other pieces of furniture in her room to make space for her new bed.

      Sunday, March 8

      My daughter managed to avoid falling out of her new loft bed. Success! My husband finished sanding and varnishing the wooden guard rail he had been making, so he installed it so that we could use the mattress, which is too thick for the original rail.

      I started moving my code out into a new language-learning package. I don't know whether it will be useful to other people, but if I want to help others try it, it needs a bit of work.

      The weather was beautiful. My husband, my daughter, and I went to IKEA to buy cushions, lights, and a gym mat for the little play nook under my daughter's new bed. While we were there, my daughter saw a knife she liked, so we bought that too. At home, she set up the mat and the cushions herself. She decided to return the lights for a refund next week.

      For dinner, we made chicken nuggets, fries, and broccoli.

      On artificial intelligence

      At our last session, my tutor asked me questions about artificial intelligence. I want to think through AI so that I can work on my pronunciation using a topic that interests both of us, and so that I can find things to improve.

      First, some context to explain my perspective:

      • I'm setting aside the questions about environmental impact and the ethics of the input data.
      • So far, I've tried AI for my interests, such as parenting, learning French, and programming in Emacs Lisp, Python, and Javascript. I've also used it for research.
      • I only do a little consulting work, and really, it's just for fun. I don't want to increase my workload, because I'm focusing on my daughter and my personal interests. Nothing is pushing me to use AI (no boss, clients, or competitors). AI doesn't threaten me. I can use it or not, as I please. I can focus on my happiness.
      • I can spend a small part of my budget on experiments, but I don't want to work more in order to justify a bigger expense. For now, the free usage limits of Gemini, Claude, and Azure are enough for my ideas and my limited time. I don't have the focused time needed to justify investing in my own hardware, and besides, things are moving too fast to commit to a specific setup.
      • I'm keenly aware of cognitive and physical limits because of my mother's and my sister's health challenges, and because of my own experience of being limited as my daughter's primary caregiver.
      • I read very quickly, but I don't have much patience for long video or audio content. I don't like texts that are padded with filler.
      • I like programming, so I understand a little about how AI works and I can't attribute real intelligence to it. I also don't like unpredictable results.
      • For me, it's easy to throw out lots of ideas. It's hard to see them through. I struggle to finish my tasks because new ideas keep arriving. But almost none of my tasks are truly necessary, so that's okay.
      • I like incremental improvement. I prefer small steps, small functions, small programs.
      • Many people have a strong reaction against AI for a number of reasons, including the excessive hype around it, its misuse, and the flood of banality it produces.
      Programming

      For programming, I find that it works better for short programs than for long ones. I often rewrite most of the program except for one or two pieces, because the code doesn't suit me. From time to time, I use AI to polish or quickly check an idea before working on it myself. I don't want to use it for patches that I want to submit to other projects, because the code doesn't feel right to me and I don't want to waste other volunteers' time.

      A few concrete examples:

      • It was useful for implementing a function that compares two lists and returns the added, removed, or changed items using a classic algorithm that I understand a little, but not well enough to implement myself.
      • It was useful for testing the idea of a Kokoro TTS server compatible with the speechd server, because I don't yet know how to write a multithreaded server in Python. I like being able to give it three git repositories and instructions to generate a program from one repository for another by way of the third. But I don't want to publish it before rewriting it and understanding everything.
      • It was useful for generating web interfaces for my personal ideas.
      • It wasn't very useful for tinkering with my configuration (aside from occasionally identifying commands or variables I didn't know about), because I enjoy the tinkering. Specifying my goals often takes as much work as implementing them myself.

      My husband has his own Claude AI subscription. He said he appreciates it because the AI can handle lots of small tasks that would otherwise require a lot of research. For my part, I often use Gemini AI because its free usage limit is generous. I've also tried Claude Code, but my knowledge is limited. It seems useful, but I prefer to isolate it in a virtual machine, so it's not very practical for me at the moment.

      AI is very useful for working with commands that have lots of options, like ffmpeg or gnuplot.

      I don't find AI reliable enough to let it act completely independently. Maybe someday, but for me, not yet.

      Learning French

      I like using AI to give me feedback on my writing. If I only use the dictionary, I make lots of anglicisms because of literal translation. The topics that interest me are a bit niche, so it may be hard to find a tutor who focuses on exactly those. It's a bit inefficient to correct my writing word by word with a professional, and my journal and my thoughts aren't that important. With AI, I don't have to spend my tutor's time correcting lots of mistakes like subject-verb agreement or awkward word choices, and I discover new words and expressions. The AI's suggestions are occasionally strange, so it's always a good idea to check with real people. Without AI, I could probably learn more slowly with the help of the Internet, which has lots of resources like the Vitrine linguistique.

      I've tried AI for giving feedback on my pronunciation, but I don't think it's reliable yet, and I don't have the experience to judge it well. Maybe I could check my results with a tutor, but that might be difficult because of conflicting goals, like people who are asked to train their own replacements. Actually, I don't want to replace human connection. I want to enjoy more and learn more with the help of real people, supplemented by help from AI. There are researchers studying applications of AI to language learning; I can wait for their findings. In the meantime, I think it's better to use AI to understand other ways of analyzing my pronunciation myself, and to build personalized tools: maybe summaries and excerpts of our sessions, visualizations of my attempts, or an interface for recording and listening in real time.

      From time to time, I try generating stories or articles that are comprehensible at (or nearly at) my level. For now, I prefer other resources for reading, like the subtitles of shows. That said, the machine translations on Reddit interest me, so I've managed to replace my news feed with a feed in French.

      I'm not ready to converse with AIs by voice yet. I've tried free conversation and nearly scripted dialogue. I love live subtitles, but I haven't always found a method or a system that suits me. In free conversation, I know the other party is an AI, so I don't have any real curiosity about its "interests or thoughts." The conversation felt very artificial. Besides, I think I'd rather build one myself so that I'd have more control. In any case, my pronunciation, grammar, and vocabulary need work. In scripted dialogue, I don't yet have a rich enough vocabulary to discuss the topics in the generic exercises. And if I'm just repeating, I don't need AI for that.

      Parenting

      I've sometimes used Claude AI to generate interactive stories about my daughter's interests. The stories include the words my daughter needs to learn for her class. They let her tap a word to hear it via text-to-speech and to see the translation. She likes that format. My daughter's teacher doesn't have the time to personalize vocabulary learning to that degree, and she's too unpredictable to plan her own sessions with a tutor.

      She likes generating other interactive stories with the AI herself, like little games about KPop Demon Hunters or Pokémon. I think it's a good way to practice thinking about what she wants, how to explain it, and how to refine it.

      She's 10 years old. No one knows what the world will really look like when she grows up. I think it's better for my husband and me to model how to approach it, how to learn, and how to decide what we think, without fear and without hype.

      Without AI, we could improvise our own stories. But I think that being able to give her more control in a fast feedback loop1 is a good thing.

      I don't like using it to try to resolve my parenting dilemmas, because AI validates whatever you give it. From time to time, I use it to generate questions for reflection, which is a little more useful.

      Miscellaneous

      I like speech recognition because it lets me capture more ideas faster (before I forget them) and analyze the transcripts without having to re-listen to all the recordings. Many things can keep a person from typing. I like programming and writing, and I want to keep doing them for a long time. I'm looking forward to exploring voice interfaces.

      I think the probabilistic way AI works is promising for searching for things I don't know exactly, which will be very useful when you have brain fog. I don't like the summaries, which are often bad and which take away the experience of coming across other people who are thinking similar things too. I like following links where I can learn more. I also like asking an AI a few questions before, or instead of, asking a real person.

      My next steps

      I'm going to keep trying AI in my areas of interest. I want to extract my personal functions into speech-recognition and language-learning libraries to help other people, but I'm moving slowly because my attention is easily diverted. Little by little.

      I want to try the AI libraries for Emacs, like agent-shell. If I can manually approve each command, I think that's fine.

      Footnotes

      1

      Feedback loop? My tutor was not sure about the wording.

      You can e-mail me at sacha@sachachua.com.

    12. 🔗 sacha chua :: living an awesome life Emacs Lisp and NodeJS: Getting the bolded words from a section of a Google Document rss

      Update: Simplified getting a section or finding the bolded text by using the Org Mode format instead.

      During the sessions with my French tutor, I share a Google document so that we can mark the words where I need to practice my pronunciation some more or tweak the wording. Using Ctrl+B to make the word bold is an easy way to make it jump out.

      I used to copy these changes into my Org Mode notes manually, but today I thought I'd try automating some of it.

      First, I need a script to download the HTML for a specified Google document. This is probably easier to do with the NodeJS library than with oauth2.el and url-retrieve-synchronously because of various authentication things.

      require('dotenv').config();
      const { google } = require('googleapis');
      
      async function download(fileId) {
        const auth = new google.auth.GoogleAuth({
          scopes: ['https://www.googleapis.com/auth/drive.readonly'],
        });
        const drive = google.drive({ version: 'v3', auth });
        const htmlRes = await drive.files.export({
          fileId: fileId,
          mimeType: 'text/html'
        });
        return htmlRes.data;
      }
      
      async function main() {
        console.log(await download(process.argv.length > 2 ? process.argv[2] : process.env['DOC_ID']));
      }
      
      main();
      

      Then I can wrap a little bit of Emacs Lisp around it.

      (defvar my-google-doc-download-command
        (list "nodejs" (expand-file-name "~/bin/download-google-doc-html.cjs")))
      
      (defun my-google-doc-html (doc-id)
        (when (string-match "https://docs\\.google\\.com/document/d/\\(.+?\\)/" doc-id)
          (setq doc-id (match-string 1 doc-id)))
        (with-temp-buffer
          (apply #'call-process (car my-google-doc-download-command)
                 nil t nil (append (cdr my-google-doc-download-command) (list doc-id)))
          (buffer-string)))
      
      (defun my-google-doc-org (doc-id)
        "Return DOC-ID in Org Mode format."
         (let ((dom (with-temp-buffer
                     (insert (my-google-doc-html doc-id))
                     (libxml-parse-html-region))))
          ;; bold text is actually represented as font-weight:700 instead
          (dom-search
           dom
           (lambda (o)
             (when (and
                    (string-match "font-weight:700" (or (dom-attr o 'style) ""))
                    (not (string-match "font-style:normal" (or (dom-attr o 'style) ""))))
               (setf (car o) 'strong))
             (when (dom-attr o 'style)
               (dom-remove-attribute o 'style))))
          (with-temp-buffer
            (svg-print dom)
            (pandoc-convert-stdio (buffer-string) "html" "org"))))
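
      pandoc-convert-stdio is one of my helper functions rather than something that ships with Emacs or pandoc. A minimal sketch of that kind of helper, assuming the pandoc command is installed, might look like this:

      ;; Minimal sketch: filter TEXT through the pandoc command.
      (defun pandoc-convert-stdio (text from to)
        "Convert TEXT from format FROM to format TO using pandoc."
        (with-temp-buffer
          (insert text)
          ;; DELETE=t and BUFFER=t replace the region with pandoc's output.
          (call-process-region (point-min) (point-max)
                               "pandoc" t t nil
                               "-f" from "-t" to)
          (buffer-string)))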
      

      I have lots of sections in that document, including past journal entries, so I want to get a specific section by name.

      (defun my-org-get-subtree-by-name (org-text heading-name)
        "Return ORG-TEXT subtree for HEADING-NAME."
        (with-temp-buffer
          (insert org-text)
          (org-mode)
          (goto-char (point-min))
          (let ((org-trust-scanner-tags t))
            (car (delq nil
                       (org-map-entries
                        (lambda ()
                          (when (string= (org-entry-get (point) "ITEM") heading-name)
                            (buffer-substring (point) (org-end-of-subtree))))))))))
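
      For example, here's how I could pull the AI section out of my notes (the document URL below is just a placeholder):

      ;; Hypothetical usage; substitute a real Google document URL or ID.
      (my-org-get-subtree-by-name
       (my-google-doc-org "https://docs.google.com/document/d/DOC-ID/edit")
       "Sur l'intelligence artificielle")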
      

      Now I can get the bolded words from a section of my notes, with just a sentence for context. I use pandoc to convert it to Org Mode syntax.

      (defvar my-lang-words-for-review-context-function 'sentence-at-point)
      
      (defun my-lang-tutor-notes (section-name)
        (my-org-get-subtree-by-name
         (my-google-doc-org my-lang-tutor-notes-url)
         section-name))
      
      (defun my-lang-words-for-review (section)
        "List the bolded words for review in SECTION."
        (let* ((section (my-lang-tutor-notes section))
               results)
          (with-temp-buffer
            (insert section)
            (org-mode)
            (goto-char (point-min))
            (org-map-entries
             (lambda ()
               (org-end-of-meta-data t)
               (while (re-search-forward "\\*[^* ].*?\\*" nil t)
                 (cl-pushnew
                  (replace-regexp-in-string
                   "[ \n ]+" " "
                   (funcall my-lang-words-for-review-context-function))
                  results
                  :test 'string=)))))
          (nreverse results)))
      

      For example, when I run it on my notes on artificial intelligence, this is the list of bolded words and the sentences that contain them.

      (my-lang-words-for-review "Sur l'intelligence artificielle")
      

      I can then go into the WhisperX transcription JSON file and replay those parts for closer review.
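
      I haven't settled on a tidy interface for that part yet. As a rough sketch (not the exact code I use), WhisperX's JSON output includes a segments array with start and end times, so I can look up a segment by its text and hand that span to mpv:

      ;; Rough sketch: play the span of MEDIA-FILE whose WhisperX segment in
      ;; JSON-FILE contains TEXT. The function name and interface are
      ;; illustrative, not part of my published config.
      (defun my-play-whisperx-match (json-file media-file text)
        (let* ((data (with-temp-buffer
                       (insert-file-contents json-file)
                       (json-parse-buffer :object-type 'alist)))
               (segment (seq-find
                         (lambda (seg)
                           (string-match-p (regexp-quote text)
                                           (alist-get 'text seg)))
                         (alist-get 'segments data))))
          (when segment
            (start-process "mpv" nil "mpv"
                           (format "--start=%s" (alist-get 'start segment))
                           (format "--end=%s" (alist-get 'end segment))
                           media-file))))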

      I can also tweak the context function to give me less information. For example, to limit it to the containing phrase, I can do this:

      (defun my-split-string-keep-delimiters (string delimiter)
        (when string
          (let (results pos)
            (with-temp-buffer
              (insert string)
              (goto-char (point-min))
              (setq pos (point-min))
              (while (re-search-forward delimiter nil t)
                (push (buffer-substring pos (match-beginning 0)) results)
                (setq pos (match-beginning 0)))
              (push (buffer-substring pos (point-max)) results)
              (nreverse results)))))
      
      (ert-deftest my-split-string-keep-delimiters ()
        ;; Each delimiter stays attached to the segment that follows it.
        (should
         (equal (my-split-string-keep-delimiters
                 "Beaucoup de gens ont une réaction forte contre l'IA pour plusieurs raisons qui *incluent* le battage médiatique excessif dont elle fait l'objet, son utilisation à mauvais escient, et *l'inondation de banalité* qu'elle produit."
                 ", \\| que \\| qui \\| qu'ils? \\| qu'elles? \\| qu'on ")
                '("Beaucoup de gens ont une réaction forte contre l'IA pour plusieurs raisons"
                  " qui *incluent* le battage médiatique excessif dont elle fait l'objet"
                  ", son utilisation à mauvais escient"
                  ", et *l'inondation de banalité*"
                  " qu'elle produit."))))
      
      (defun my-lang-words-for-review-phrase-context (&optional s)
        (setq s (replace-regexp-in-string " " " " (or s (sentence-at-point))))
        (string-join
         (seq-filter (lambda (s) (string-match "\\*" s))
                     (my-split-string-keep-delimiters s ", \\| parce que \\| que \\| qui \\| qu'ils? \\| qu'elles? \\| qu'on \\| pour "))
         " ... "))
      
      (ert-deftest my-lang-words-for-review-phrase-context ()
        (should
         (equal (my-lang-words-for-review-phrase-context
                 "Je peux consacrer une petite partie de mon *budget* à des essais, mais je ne veux pas travailler davantage pour rentabiliser une dépense plus importante.")
                "Je peux consacrer une petite partie de mon *budget* à des essais")))
      
      (let ((my-lang-words-for-review-context-function 'my-lang-words-for-review-phrase-context))
        (my-lang-words-for-review "Sur l'intelligence artificielle"))
      

      Now that I have a function for retrieving the HTML or Org Mode for a section, I can use that to wdiff against my current text to more easily spot wording changes.

      (defun my-lang-tutor-notes-wdiff-org ()
        (interactive)
        (let ((section (org-entry-get (point) "ITEM")))
          (my-wdiff-strings
           (replace-regexp-in-string
            " " " "
            (my-org-subtree-text-without-blocks))
           (replace-regexp-in-string
            " " " "
            (my-lang-tutor-notes section)))))
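
      my-wdiff-strings is another helper from my configuration that isn't shown here. One way to write something like it, assuming the wdiff command-line tool is installed, is to write both strings to temporary files and compare those:

      ;; Sketch of a wdiff wrapper; the real helper may differ.
      (defun my-wdiff-strings (a b)
        "Show a word-level diff between strings A and B using wdiff."
        (let ((file-a (make-temp-file "wdiff-a"))
              (file-b (make-temp-file "wdiff-b")))
          (unwind-protect
              (progn
                (with-temp-file file-a (insert a))
                (with-temp-file file-b (insert b))
                (with-current-buffer (get-buffer-create "*wdiff*")
                  (erase-buffer)
                  ;; wdiff exits non-zero when it finds differences; that's fine.
                  (call-process "wdiff" nil t nil file-a file-b)
                  (display-buffer (current-buffer))))
            (delete-file file-a)
            (delete-file file-b))))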
      

      Related:

      Screenshot:

      2026-03-12_11-28-24.png
      Figure 1: wdiff
      This is part of my Emacs configuration.

      You can e-mail me at sacha@sachachua.com.

    13. 🔗 News Minimalist 🐢 AI startup raises $1 billion to fix hallucinations + 10 more stories rss

      In the last 4 days Gemini read 118955 top news stories. After removing previously covered events, there are 11 articles with a significance score over 5.5.

      [5.9] Yann LeCun's AMI Labs raises $1.03 billion to build AI world models —techcrunch.com(+11)

      Yann LeCun’s new venture, AMI Labs, raised $1.03 billion to develop world models that learn from physical reality, seeking to overcome the reliability limitations of existing large language models.

      Valued at $3.5 billion, the company focuses on Joint Embedding Predictive Architecture to minimize AI hallucinations. Major investors like NVIDIA and Bezos Expeditions funded the round, supporting a high-profile research team operating across Paris, New York, Montreal, and Singapore.

      Although commercial applications may take years, the startup intends to publish its research and release open-source code. Early deployments will be tested through industrial partners, including the healthcare startup Nabla.

      [6.0] Global repercussions emerge as US, Israel, and Iran war expands —npr.org(+952)

      A week of U.S. and Israeli strikes against Iran has killed Supreme Leader Ayatollah Ali Khamenei and neutralized Iran's military, sparking a widening regional conflict and global economic instability.

      Iran responded with retaliatory attacks across the Middle East, striking U.S. bases and oil infrastructure in several Gulf nations. Fighting has spread to Lebanon while oil prices surged past ninety dollars per barrel following the closure of the strategic Strait of Hormuz.

      Global powers including China and Russia have called for de-escalation as diplomatic tensions rise between the U.S. and European allies. Meanwhile, the conflict continues to disrupt energy markets and international trade.

      Highly covered news with significance over 5.5

      [6.4] Ukraine deploys armed robots to combat Russian forces — bbc.com (+3)

      [6.4] Germany becomes the fourth-largest global arms exporter — tagesschau.de (German) (+18)

      [6.1] China exports surge in first two months of the year despite Trump tariffs — bbc.com (+8)

      [5.9] Trump pressures Latin American leaders to reduce China ties — courant.com (+77)

      [5.8] France sends aircraft carrier to protect Strait of Hormuz shipping — smh.com.au (+15)

      [5.8] Apple increases iPhone production in India to 25% — businesstoday.in (+6)

      [5.7] Trump launches Americas Counter Cartel Coalition with Latin American and Caribbean nations — nytimes.com (+9)

      [5.7] Federal pilot program launches flying cars in eight US regions this summer — wired.com [$] (+4)

      [5.8] UK cancer death rates reach historic low — news.sky.com (+4)

      Thanks for reading!

      — Vadim



    14. 🔗 r/reverseengineering Reverse engineering FORM swim goggles: custom protobuf over BLE, 697 captured API requests, full protocol documented rss
    15. 🔗 r/LocalLLaMA This guy 🤡 rss

      At least T3 Code is open-source/MIT licensed.

      submitted by /u/xenydactyl
      [link] [comments]

    16. 🔗 r/reverseengineering I've made indent guides plugin for IDA rss
    17. 🔗 r/LocalLLaMA We need a minimum karma rule for commenting and posting rss

      so many slop bots here. it’s becoming a kindergarten for openclaws. bots responding to bots.

      submitted by /u/nomorebuttsplz
      [link] [comments]

    18. 🔗 r/LocalLLaMA How I topped the Open LLM Leaderboard using 2x 4090 GPUs — no weights modified. rss

      Hi LocalLLaMAs,

      A few years ago, I found that duplicating a specific block of 7 middle layers in Qwen2-72B, without modifying any weights, improved performance across all Open LLM Leaderboard benchmarks and took #1. As of 2026, the top 4 models on that leaderboard are still descendants.

      The weird finding: single-layer duplication does nothing. Too few layers, nothing. Too many, it gets worse. Only circuit-sized blocks of ~7 layers work. This suggests pretraining carves out discrete functional circuits in the layer stack that only work when preserved whole.

      The whole thing was developed on 2x RTX 4090s in my basement. I don't write papers any more, so here is a full technical write-up in Blog format for your enjoyment. I'm the same guy who built GLaDOS, and scored a crazy Nvidia GH200 system here on Reddit.

      I'm now running current models (GLM-4.7, Qwen3.5, MiniMax M2.5) on this dual GH200 rig (see my other post). Code and new models coming soon, including special RYS versions of Qwen3.5 27B and 35A3B. Happy to answer questions.

      submitted by /u/Reddactor
      [link] [comments]

    19. 🔗 r/Harrogate Improv Session in Harrogate rss

      Hi All, I run improv comedy sessions every couple of weeks in Harrogate. Our next one is next Tuesday (17th March). They are very low pressure: we do some easy group warm-ups, followed by games and exercises. Our current sessions are aimed at beginners and improvers, so there has never been a better time to try it out. If you have any questions, let me know. As a bonus, first-time joiners get their first session free. Thanks.

      submitted by /u/GritstoneBoulderer
      [link] [comments]

    20. 🔗 r/york Hotel Advice rss

      Hi all,

      I'm hoping to book a really nice room for my husband's 40th in October. I've currently booked a suite in the Judges Lodging but have just seen a nice looking deluxe room in Galtres Lodge Hotel.

      Would anyone have a preference here?

      Budget is max £300 a night and would like something as nice as possible given the occasion.

      Thank you in advance. :)

      submitted by /u/Routine_Raisin_3698
      [link] [comments]

    21. 🔗 r/Leeds The scooters are coming... Beryl scheme approved, 100 scooters to be available. rss

      https://democracy.leeds.gov.uk/ieDecisionDetails.aspx?ID=58678

      LCC's Facebook Post with quite a few comments

      Geofencing supposedly in place but it will be interesting to see how this is applied, in some cities they have very definite "no scoot" areas where they just stop working and LCC suggest this, along with speed limiters, will be implemented.

      They mention "control of the e-scooters in defined pedestrianised areas" so presumably they will be allowed on sections like Briggate for example?

      Will the pavements be littered with them and the river / canal their home before long? They suggest they will have to be "docked" like the bikes and not just left in a painted area like in York etc so hopefully this won't be an issue.

      Are our roads, cycle lanes and shared spaces suitable for those small wheels given the state of some of them and helmets only "recommended"?

      Given the illegal scooters pretty much have the keys to the city along with the illegal electric motorbikes pretending to be bicycles will the introduction of the legal scooters make the problem worse as people no longer think "scooter bad" by default?

      Hopefully the pricing isn't as high as the bikes to the point it's as cheap to get a bus, the focus on "cycling infrastructure" makes a bit more sense with this I suppose - will you give them a go?

      submitted by /u/thetapeworm
      [link] [comments]

    22. 🔗 r/york Moving van rentals rss

      Hi there, I’m moving from one side of York to the centre over the next couple of weeks. Does anyone have any good recommendations for small removal companies? Or even just a man with a van for a couple of hours that would be happy to help me move some furniture? It just seems like every quote I get online is ridiculously high or is only by the hour. I’m happy to pay obviously, I’d just like it to be a fixed sum. It’s my first time moving house by myself so am looking for any advice or recommendations. Thank you!

      Edit: Ooh, great ideas from all, thank you. I can rent one, and I do potentially have someone to drive it across York. Just waiting to hear back from them now. Just had a small panic being a 5ft2 woman with some big furniture!

      submitted by /u/WoodpeckerContent
      [link] [comments]

    23. 🔗 r/york Grotesque proposal for a country themed bar rss

      Does anyone have any intel about where this proposed new venue will be located, and how local residents can contest these plans? From the tasteless AI images shared online yesterday, via a recruitment ad, it looks very similar to the church at the bottom of Micklegate (Jalou). Either way, it sounds like catnip to stag and hen dos.

      York does not need any more venues which promote unsafe and excessive drinking and fuel episodes of antisocial behaviour and littering, alienating residents from accessing and enjoying THEIR city centre at weekends. The council should be supporting independent business ideas which reflect the city's culture and heritage, and most importantly show respect for and work alongside residents. Would a venue like this even be proposed somewhere like Edinburgh? Exactly. More class, less fad is what our city should aspire to.

      York residents have significant disposable income. However, we aren't going to want to spend our money and leisure time here if we risk being dragged into breaking up a brawl, being vomited on, or our children and pets stepping in broken glass.

      submitted by /u/Aggravating-Unit3970
      [link] [comments]

    24. 🔗 HexRaysSA/plugin-repository commits sync repo: +3 plugins, +4 releases rss
      sync repo: +3 plugins, +4 releases
      
      ## New plugins
      - [ApplyCalleeTypeEx](https://github.com/Dump-GUY/ApplyCalleeTypeEx) (1.0.0)
      - [IDAssist](https://github.com/symgraph/IDAssist) (1.0.4)
      - [IDAssistMCP](https://github.com/symgraph/IDAssistMCP) (1.0.3)
      
      ## New releases
      - [IDAGuides](https://github.com/libtero/idaguides): 1.2.0
      
    25. 🔗 r/reverseengineering IronPE - Minimal Windows PE manual loader written in Rust. rss
    26. 🔗 @malcat@infosec.exchange We're happy to announce that [#malcat](https://infosec.exchange/tags/malcat) mastodon

      We're happy to announce that #malcat 0.9.13 is out!

      You'll find a new Apple-silicon MacOS port, two integrated MCP servers (in-GUI + headless) for automated triage, and an improved interface:

      https://malcat.fr/blog/0913-is-out-macos-port-mcp-server-and-dark-mode

    27. 🔗 r/Yorkshire Yorkshire pudding is easily the best part of a roast. I don’t think a roast dinner is complete without one. rss
    28. 🔗 r/wiesbaden Wiesbaden macht Wiesbaden-Sachen rss
    29. 🔗 r/LocalLLaMA Qwen 3.5 0.8B - small enough to run on a watch. Cool enough to play DOOM. rss

      So I went down the rabbit hole of making a VLM agent that actually plays DOOM. The concept is dead simple: take a screenshot from VizDoom, draw a numbered grid on top, send it to a vision model with two tools (shoot and move), and let the model decide what to do. Repeat.

      The wild part? It's Qwen 3.5 0.8B, a model that can run on a smartwatch, trained to generate text, but it handles the game surprisingly well. On the basic scenario it actually gets kills. Like, it sees the enemy, picks the right column, and shoots. I was genuinely surprised. On defend_the_center it's trickier: it hits enemies, but doesn't conserve ammo, and by the end it keeps trying to shoot when there's nothing left. But sometimes it outputs stuff like "I see a fireball but I'm not sure if it's an enemy", which is oddly self-aware for 0.8B parameters.

      The stack is Python + VizDoom + direct HTTP calls to LM Studio. Latency is about 10 seconds per step on an M1-series Mac. Currently trying to fix the ammo conservation by adding a "reason" field to tool calls so the model has to describe what it sees before deciding whether to shoot or not. We'll see how it goes.

      UPD: It's now open source! GitHub: https://github.com/Felliks/DoomVLM. Added deathmatch mode, GPU support, and a Jupyter notebook; full writeup here: https://www.reddit.com/r/LocalLLaMA/comments/1rrlit7/doomvlm_is_now_open_source_vlm_models_playing_doom/

      submitted by /u/MrFelliks
      [link] [comments]

    30. 🔗 r/Yorkshire Sometimes you forget how beautiful Yorkshire actually is. rss

      submitted by /u/Pinkplatabys
      [link] [comments]

    31. 🔗 MetaBrainz He’s the man who made music metadata “free” rss

      Thank you to Giampiero Di Carlo, the editor of Rockol, who gave us permission to repost this article. Originally posted in Italian at: https://musicbiz.rockol.it/news-757360/robert-kaye-1970-2026-scomparso-il-fondatore-di-musicbrainz

      The following English translation is courtesy of Google Translate with some manual edits.

      On February 21, 2026, Robert Kaye, founder and Executive Director of the MetaBrainz Foundation, the non-profit organization that supports projects like MusicBrainz and ListenBrainz, passed away. The news was announced a few days later by the MetaBrainz Board, which described it as an unexpected passing. Reposting this remembrance on Rockol MusicBiz late was intentional: we were friends, and he deserves the visibility that the particular nature of the past week would have obscured.

      What we lose

      For those who work with music—from archives to platforms, from collectors to DJ software—Kaye is one of those figures who rarely make the front cover, yet change everything: he built the "silent" infrastructure that allows music to be found, sorted, recognized, and correctly linked over time, without this data remaining imprisoned in proprietary databases. Robert Kaye was a visionary of the free/open source community and the driving force behind the "Brainz" ecosystem. His loss is felt not only by those who compile metadata, but by anyone who uses tools based on that information.

      The reaction of the MetaBrainz community, in the official thread, speaks volumes about the human impact beyond the technical one: for many, he wasn't "just" a founder, but a daily presence within a project that thrives on volunteers, discussions and patience.

      Kaye was an engineer by training (Computer Engineering at Cal Poly) and had worked in companies and projects related to MP3 and music software during the dot-com era. At MetaBrainz, they tell it this way: his work on MP3 and his move to eMusic/FreeAmp were the spark that led him to build MusicBrainz and "fall in love" with open source.

      In 2004, he founded the MetaBrainz Foundation in California as a 501(c)(3), with a clear model: free non-commercial use and seeking financial support from commercial entities that benefit from the data and services.

      MusicBrainz and Beyond

      MusicBrainz is often described as an open music encyclopedia: a community database of artists, releases, and relationships that is the backbone for tagging, cataloging, and software integrations. The MetaBrainz ecosystem has since expanded (into ListenBrainz and other projects) but maintained the core idea: making metadata reusable, interoperable, and verifiable by a community. In practice, Robert Kaye's work is visible everywhere without his name appearing: when software correctly recognizes an artist despite homonyms, when an archive links releases and reissues, when a DJ tags a library consistently, when an app displays credits and discographies with fewer errors.

      MetaBrainz has already clarified that the project continues under the guidance of the Board and the existing structure and that updates on the transition will be shared. This is a very delicate transition: when a founder of an infrastructure passes away, the challenge is not just "keeping the servers running," but maintaining the trust of communities and commercial partners who depend on the collective effort.

      A "visible" founder: style, character, community

      Many tributes in recent days have emphasized a detail that is often crucial in open source projects: the founder's personality as the glue. In a personal recollection, Denny Vrandečić describes him as a "principled", "determined", loud and generous figure, capable of both energy and care—a rare combination in someone who must balance vision, inevitable conflicts within a community and sustainability. This isn't folklore: in community projects "governance" also involves tone, presence and the ability to make things happen without shutting down those who contribute. And we're not talking about a niche project here, but a piece of the music internet that many industries take for granted.

      To honor Robert Kaye today, it's crucial to emphasize that his legacy isn't a product but an operationalized idea: that music data can remain a common good, defensible and improvable, rather than becoming merely a closed commodity. And it's an idea that, in 2026, retains a certain weight.

    32. 🔗 Julia Evans Examples for the tcpdump and dig man pages rss

      Hello! My big takeaway from last month's musings about man pages was that examples in man pages are really great, so I worked on adding (or improving) examples to two of my favourite tools' man pages.

      Here they are:

      the goal: include the most basic examples

      The goal here was really just to give the absolute most basic examples of how to use the tool, for people who use tcpdump or dig infrequently (or have never used it before!) and don't remember how it works.

      So far, saying "hey, I want to write an examples section for beginners and infrequent users of these tools" has been working really well. It's easy to explain, I think it makes sense from everything I've heard from users about what they want from a man page, and maintainers seem to find it compelling.

      Thanks to Denis Ovsienko, Guy Harris, Ondřej Surý, and everyone else who reviewed the docs changes, it was a good experience and left me motivated to do a little more work on man pages.

      why improve the man pages?

      I'm interested in working on tools' official documentation right now because:

      • Man pages can actually have close to 100% accurate information! Going through a review process to make sure that the information is actually true has a lot of value.
      • Even with basic questions "what are the most commonly used tcpdump flags", often maintainers are aware of useful features that I'm not! For example I learned by working on these tcpdump examples that if you're saving packets to a file with tcpdump -w out.pcap, it's useful to pass -v to print a live summary of how many packets have been captured so far. That's really useful, I didn't know it, and I don't think I ever would have noticed it on my own.

      It's kind of a weird place for me to be because honestly I always kind of assume documentation is going to be hard to read, and I usually just skip it and read a blog post or Stack Overflow comment or ask a friend instead. But right now I'm feeling optimistic, like maybe the documentation doesn't have to be bad? Maybe it could be just as good as reading a really great blog post, but with the benefit of also being actually correct? I've been using the Django documentation recently, and it's really good! We'll see.

      on avoiding writing the man page language

      The tcpdump tool's man page is written in the roff language, which is kind of hard to use and which I really did not feel like learning.

      I handled this by writing a very basic script to convert Markdown to roff, using similar conventions to what the man page was already using. I could maybe have just used pandoc, but the output pandoc produced seemed pretty different, so I thought it might be better to write my own script instead. Who knows.

      I did think it was cool to be able to just use an existing Markdown library's ability to parse the Markdown AST and then implement my own code-emitting methods to format things in a way that seemed to make sense in this context.
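
      Just to give a flavor of the kind of mapping involved (this is not Julia's script, which parses the Markdown AST with a library; it's only a toy line-based sketch, written in Emacs Lisp like the other code in this digest):

      ;; Toy sketch: map a minimal subset of Markdown onto man page macros.
      ;; .SH and .SS are section/subsection headings; .PP starts a paragraph.
      (defun my-md-to-roff (markdown)
        (mapconcat
         (lambda (line)
           (cond
            ((string-match "\\`## \\(.*\\)" line)
             (concat ".SS " (match-string 1 line)))
            ((string-match "\\`# \\(.*\\)" line)
             (concat ".SH " (match-string 1 line)))
            ((string= line "") ".PP")
            (t line)))
         (split-string markdown "\n")
         "\n"))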

      man pages are complicated

      I went on a whole rabbit hole learning about the history of roff, how it's evolved since the 70s, and who's working on it today, inspired by learning about the mandoc project that BSD systems (and some Linux systems, and I think Mac OS) use for formatting man pages. I won't say more about that today though, maybe another time.

      In general it seems like there's a technical and cultural divide in how documentation works on BSD and on Linux that I still haven't really understood, but I have been feeling curious about what's going on in the BSD world.

      The comments section is here.