Now we need processes to gain awareness of the process manager, each with an LLM baked in to argue with the process manager about why it should be allowed to live.
But seriously, it really does bug me on principle that Dropbox should use over half a GB simply because it uses Chromium, even when nothing is visible.
For me it's LSP servers taking 2 GB of RAM. With Antigravity, Google managed to go beyond this; it is totally unusable for me (though other VS Code clones work fine, apart from the 2 GB LSP servers).
oh absolutely. burning a coal plant to decide if i should close discord is peak 2025 energy.
strictly speaking, using the local model (Ollama) is 'free' in terms of watts since my laptop is on anyway, but yeah, if the inefficiency is the art, I'm the artist.
An interesting thought experiment: a fully local, off-grid, off-network LLM device. Solar or wind or what have you. I suppose the Mac Studio route is a good option here; I think Apple makes the most energy-efficient high-memory options. Back of the napkin indicates it’s possible, just a high up-front cost. Interesting to imagine a somewhat catastrophe-resilient LLM device…
Macs would be the most power efficient with faster memory but an AI Max 395+ based system would probably be the most cost efficient right now. A Framework Desktop with 128GB of shared RAM only pulls 400W (and could be underclocked) and is cheaper by enough that you could buy it plus 400W of solar panels and a decently large battery for less than a Mac Studio with 128GB of RAM. Unfortunately the power efficiency win is more expensive than just buying more power generation and storage ability.
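Back-of-napkin version of that tradeoff (every price here is a rough assumption for illustration, not a quote; check current numbers):

    # all USD figures are assumptions, not real quotes
    mac_studio_128gb = 3500   # assumed Mac Studio w/ 128GB
    framework_128gb  = 2000   # assumed Framework Desktop w/ 128GB
    solar_400w       = 400    # assumed ~400W of panels
    battery          = 800    # assumed decent LiFePO4 bank

    off_grid_build = framework_128gb + solar_400w + battery   # 3200
    print(off_grid_build < mac_studio_128gb)  # True, under these assumptions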
I suppose in terms of catastrophe resilience, repairability would be important, although how do you repair a broken GPU in any case? Cold backup machines are probably the more feasible way to extend lifetimes.
And yeah - I was thinking that actually power efficiency isn’t really a massive deal if you have some kind of thin-client setup. The LLM nodes can be at millraces or some other power-dense locations, and then the clients are basically 5W displays with an RF transceiver and a keyboard…
I think we are moving toward a bilayered compute model:
The Cloud: For massive reasoning.
The Local Edge: A small, resilient model that lives on-device and handles the OS loop, privacy, and immediate context.
BrainKernel is my attempt to prototype that Local Edge layer. It's messy right now, but I think the OS of 2030 will definitely have a local LLM baked into the kernel.
Well, on my MacBook, some of that already exists. In the Shortcuts app you can use the "Use Model" action, which offers to run an LLM on Apple's cloud, on-device, or via an external service (e.g. ChatGPT). I use this myself already for several actions, like reading emails from my tennis club to put events in my calendar automatically.
Whether or not we'll see it lower down in the system I'm not sure. Honestly I'm not certain of the utility of an autonomous LLM loop in many or most parts of an OS, where (in general) systems have more value the more deterministic they are, but in the user space, who can say.
In any case, I certainly went down a fun rabbit hole thinking about a mesh network of LLM nodes and thin clients in a post-collapse world. In that scenario, I wonder if the utility of LLMs is really worth the complexity versus a Kindle-like device with a copy of Wikipedia...
OP here. this is a cursed project lol, but i wanted to see: What happens if you replace the OS scheduler with an LLM?
With Groq speed (Llama 3 @ 800t/s), inference is finally fast enough to be in the system loop.
i built this TUI to monitor my process tree. instead of just showing CPU %, it checks the context (parent process, disk I/O) to decide if a process is compiling code or bloatware. It roasts, throttles, or kills based on that.
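for the curious, the judging loop is roughly this shape (a sketch using psutil and a local Ollama endpoint; for the 800t/s version you'd point it at Groq's OpenAI-compatible API instead. the names and prompt here are illustrative, not the actual BrainKernel code):

    import json
    import psutil
    import requests

    def process_context(p: psutil.Process) -> dict:
        # the signals the model judges on: lineage, CPU, memory, disk I/O
        parent = p.parent()
        io = p.io_counters()  # note: io_counters() is unavailable on macOS
        return {
            "name": p.name(),
            "parent": parent.name() if parent else None,
            "cpu_percent": p.cpu_percent(interval=0.1),
            "memory_mb": p.memory_info().rss // (1024 * 1024),
            "disk_read_mb": io.read_bytes // (1024 * 1024),
        }

    def verdict(ctx: dict) -> str:
        # ask the model to classify the process: KEEP, THROTTLE, or KILL
        resp = requests.post("http://localhost:11434/api/generate", json={
            "model": "llama3",
            "prompt": "Judge this process. Reply KEEP, THROTTLE or KILL.\n"
                      + json.dumps(ctx),
            "stream": False,
        })
        return resp.json()["response"].strip()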
It's my experiment in what "Intelligent Kernels" might look like. i used Delta Caching to keep overhead low.
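by Delta Caching i mean roughly this: hash each process's snapshot and only hit the model when something actually changed (a sketch building on verdict() from the code above, not the real implementation):

    import json

    _cache: dict[int, tuple[str, str]] = {}  # pid -> (snapshot key, verdict)

    def cached_verdict(pid: int, ctx: dict) -> str:
        # bucket cpu% so tiny fluctuations don't bust the cache
        ctx = {**ctx, "cpu_percent": round(ctx["cpu_percent"] / 10) * 10}
        key = json.dumps(ctx, sort_keys=True)
        hit = _cache.get(pid)
        if hit and hit[0] == key:
            return hit[1]        # no delta -> no inference call
        v = verdict(ctx)         # only pay for inference on a change
        _cache[pid] = (key, v)
        return v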
You're underselling this as a process manager; with some prompt changes it could also be a productivity tool: detect procrastination apps (games, non-professional chat, video streaming) and kill them.
I was looking for a project which would run an LLM-powered character (like Clippy) that would periodically screenshot my screen and comment on my life choices.
Sadly, the only project I've found was for Windows.
you are technically right (the best kind of right). i am running in userspace, so i can't replace the actual thread scheduling logic in Ring 0 without writing a driver and BSODing my machine.
think of this more as a High-Level Governor. The NTOS scheduler decides which thread runs next, but this LLM decides if that process deserves to exist at all.
basically: NTOS tries to be fair to every process. BrainKernel overrides that fairness with judgment. if i suspend a process, i have effectively vetoed the scheduler.
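from userspace, that 'veto' is just signals under the hood. a minimal sketch with psutil (illustrative, not the exact BrainKernel code):

    import psutil

    def veto(pid: int, action: str) -> None:
        p = psutil.Process(pid)
        if action == "THROTTLE":
            p.suspend()  # SIGSTOP on Unix; suspends all threads on Windows
        elif action == "KILL":
            p.kill()     # SIGKILL / TerminateProcess. no appeal.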
This is a gross oversimplification of the NTOS scheduler. It's not that dumb!
> if i suspend a process, i have effectively vetoed the scheduler.
I mean, I suppose? It's the NTOS scheduler doing the suspension. It's like changing the priority level -- sure, you can do it, but it's generally to your detriment outside of corner cases.
Interesting experiment. Scheduling decisions feel like the place where unpredictability shows up first. Curious how you reason about rollback when the scheduler makes a bad call.
Great point. In a real kernel, non-determinism is a bug. Here, it's a feature (or at least, a known hazard).
To answer your question: There is no Ctrl+Z for SIGKILL. Once the LLM decides to kill a process, it's gone.
My reasoning for 'rollback' is actually latency. I built in a 'Roasting Phase' where the agent mocks the process for a few seconds before executing the kill. That delay acts as an optimistic lock: it gives me a window to veto the decision if I see it targeting something critical.
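The mechanism is roughly this (a sketch; the real agent generates its own roasts, and Ctrl+C standing in for the veto is an assumption of this example):

    import time
    import psutil

    def roast_then_kill(pid: int, roast: str, grace_s: int = 5) -> None:
        p = psutil.Process(pid)
        print(f"[BrainKernel] {roast}")
        try:
            for i in range(grace_s, 0, -1):
                print(f"SIGKILL in {i}s... (Ctrl+C to veto)")
                time.sleep(1)  # the 'optimistic lock': my veto window
        except KeyboardInterrupt:
            print("Veto accepted. The process lives.")
            return
        p.kill()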
If I'm AFK and it kills my IDE? I treat that as the system telling me to touch grass.
I wouldn't call it replacing the scheduler though - more that you've made a scheduler manager.
Scheduler Manager is definitely the more accurate term. I'm just the middleman between the chaos and the kernel.
while BrainKernel replies: 'Objection overruled. You have 5 seconds to wrap up before SIGKILL.'
I might actually have to build a 'Process Defense Attorney' agent now. The logs would be hilarious.
An entertaining thought experiment :)
Now that’s a cursed take on power efficiency
If I burn a billion tons of someone else's coal to make myself a paperclip (and don't have to breathe the outputs) it works out in my favor too.
A 'Focus Mode' that doesn't just block URLs but literally murders the process if I open Steam or Civilization VI.
I could probably add a --mode strict flag that swaps the system prompt to be a ruthless productivity coach. 'Oh, you opened Discord? Roast and Kill.'
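something like this (a sketch; the flag and both prompts are hypothetical, nothing here is merged yet):

    import argparse

    PROMPTS = {
        "normal": "Judge processes on resource use. Spare anything productive.",
        "strict": ("You are a ruthless productivity coach. Games, Discord, "
                   "video streaming: roast first, then KILL."),
    }

    parser = argparse.ArgumentParser(prog="brainkernel")
    parser.add_argument("--mode", choices=PROMPTS, default="normal")
    args = parser.parse_args()
    SYSTEM_PROMPT = PROMPTS[args.mode]  # swapped into every model call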
Thanks for the idea mate!