I don't know about avoided; this kind of represents the WTF-per-minute code quality measurement. When I write WTF as a response to Claude, I would actually love it if an Anthropic engineer took a look at the mess Claude has created.
We're talking about Claude Code. If you're coding and not writing or thinking in English, the agents and people reading that code will have bigger problems than a regexp missing a swear word :).
Why do you need to do it at the client side? You are leaking so much information on the client side.
And considering the speed of Claude Code, if you really want to do it on the client side, a few seconds won't be a big deal.
You have a semi-expensive process, but you want to keep particular known context out, so you put a quick and dirty search just in front of the expensive process. Instead of 'figure sentiment (20 seconds)', you have 'quick check sentiment (<1 sec)' and then 'figure sentiment v2 (5 seconds)'. Now if it is just pure regex then your analogy would hold up just fine.
I could see me totally making a design choice like that.
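To make the two-stage idea concrete, here is a minimal sketch. It is not Claude Code's implementation: the word list, `quick_check`, `expensive_sentiment` and `classify` are hypothetical names, and the slow step is faked with a trivial function so the example runs on its own.

```python
import re

# Hypothetical word list; the real pattern from the source is not reproduced here.
FRUSTRATION_RE = re.compile(r"\b(wtf|ffs|this sucks)\b", re.IGNORECASE)


def quick_check(prompt: str) -> bool:
    """Cheap pre-filter: True if the prompt contains an obvious frustration marker."""
    return FRUSTRATION_RE.search(prompt) is not None


def expensive_sentiment(prompt: str) -> str:
    """Stand-in for a slow classifier or LLM call (assumption: it returns a label)."""
    return "negative" if "not working" in prompt.lower() else "neutral"


def classify(prompt: str, slow_calls: list) -> str:
    """Run the cheap check first; only fall back to the slow path when it misses."""
    if quick_check(prompt):
        return "negative"
    slow_calls.append(prompt)  # record that the slow path was needed
    return expensive_sentiment(prompt)
```

The point of the design is that prompts caught by the regex never reach the slow step, while everything else still gets the more careful check.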
It's fast, but it'll miss a ton of cases. This feels like it would be better served by a prompt instruction, or an additional tiny neural network.
And some of the entries are too short and will create false positives. It'll match the word "offset" ("ffs"), for example. EDIT: no it won't, I missed the \b. Still sounds weird to me.
The pattern only matches if both ends are word boundaries. So "diffs" won't match, but "Oh, ffs!" will. It's also why they had to use the pattern "shit(ty|tiest)" instead of just "shit".
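A quick way to check the word-boundary behaviour yourself. The two patterns below are illustrative only, built from the fragments quoted in this thread and not copied from the actual source; `hits` is a hypothetical helper.

```python
import re

# Bare alternatives: "shit" must be a whole word because of the trailing \b.
BARE = re.compile(r"\b(ffs|shit)\b", re.IGNORECASE)
# Suffixes spelled out so inflected forms are still whole-word matches.
WITH_SUFFIXES = re.compile(r"\b(ffs|shit(ty|tiest)?)\b", re.IGNORECASE)


def hits(pattern: re.Pattern, text: str):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(hits(BARE, "offset"))             # no boundary before "ffs"
print(hits(BARE, "diffs"))              # no boundary before "ffs"
print(hits(BARE, "Oh, ffs!"))           # boundaries on both sides
print(hits(BARE, "shitty code"))        # "shit" is followed by a word character
print(hits(WITH_SUFFIXES, "shitty code"))
```

Since `\b` requires a transition between a word character and a non-word character, a short entry embedded in a longer word never matches, and every inflected form has to be listed explicitly.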
I wish that were for their logging/alerts. I definitely gauge the model's performance by how many of those words I type when I'm frustrated driving Claude Code.
It's not hard to find them: they are in clear text in the binary, so you can search for known ones with grep and find the rest nearby. You could even replace them in place (but now it's configurable).
Random aside: I've seen a 2015 game be accused of AI slop on Steam because it used a similar concept... And mind you, there's probably thousands of games that do this.
First it was punctuation and grammar, then linguistic coherence, and now it's tiny bits of whimsy that are falling victim to AI accusations. Good fucking grief
To me, this is a sign of just how much regular people do not want AI. This is worse than crypto and metaverse before it. Crypto, people could ignore and the dumb ape pictures helped you figure out who to avoid. Metaverse, some folks even still enjoyed VR and AR without the digital real estate bullshit. And neither got shoved down your throat in everyday, mundane things like writing a paper in Word or trying to deal with your auto mechanic.
But AI is causing such visceral reactions that it's bleeding into other areas. People are so averse to AI they don't mind a few false positives.
It's how people resisted CGI back in the day. What people dislike is low quality. There is a loud subset who are really against it on principle, just as there are people who insist on analog music, but regular people are much more practical. They just don't post about this all day on the internet.
No, there is a very loud minority of users who are very anti-AI, who hate on anything even remotely connected to AI and let everyone know with false claims. See the game Expedition 33, for example.
The big loss for Anthropic here is how it reveals their product roadmap via feature flags. A big one is their unreleased "assistant mode" with code name kairos.
Just point your agent at this codebase and ask it to find things and you'll find a whole treasure trove of info.
Edit: some other interesting unreleased/hidden features
- The Buddy System: Tamagotchi-style companion creature system with ASCII art sprites
- Undercover mode: Strips ALL Anthropic internal info from commits/PRs for employees on open source contributions
But will this be released as a feature? For me it seems like it's an Anthropic internal tool to secretly contribute to public repositories to test new models etc.
You'll never win this battle, so why waste feelings and energy on it? That's where the internet is headed. There's no magical human verification technology coming to save us.
Even if it is impossible to win, I am still feeling bad about it.
And at this point it is more about how large a space will remain usable and how much will be bot-controlled wasteland. I would prefer the spaces important to me to survive.
This is my pet peeve with LLMs: they almost always fail to write like a normal human would, mentioning logs or other meta-things which are not at all interesting.
lol that's funny, I have been working seriously [1] on a feature like this after first writing about it jokingly [2] earlier this year.
The joke was the assistant is a cat who is constantly sabotaging you, and you have to take care of it like a gacha pet.
The seriousness though is that actually, disembodied intelligences are weird, so giving them a face and a body and emotions is a natural thing, and we already see that with various AI mascots and characters coming into existence.
ANTI_DISTILLATION_CC
This is Anthropic's anti-distillation defence baked into Claude Code. When enabled, it injects anti_distillation: ['fake_tools'] into every API request, which causes the server to silently slip decoy tool definitions into the model's system prompt. The goal: if someone is scraping Claude Code's API traffic to train a competing model, the poisoned training data makes that distillation attempt less useful.
This is the single worst function in the codebase by every metric:
- 3,167 lines long (the file itself is 5,594 lines)
- 12 levels of nesting at its deepest
- ~486 branch points of cyclomatic complexity
- 12 parameters + an options object with 16 sub-properties
- Defines 21 inner functions and closures
- Handles: agent run loop, SIGINT, rate-limits, AWS auth, MCP lifecycle, plugin install/refresh, worktree bridging, team-lead polling (while(true) inside), control message dispatch (dozens of types), model switching, turn interruption recovery, and more
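For anyone wondering how numbers like "branch points" and "levels of nesting" are obtained, here is a rough sketch of the idea. It is an assumption-laden illustration: it analyses Python source with the standard `ast` module rather than the TypeScript file discussed above, and `branch_points`, `max_nesting` and `SAMPLE` are hypothetical names, not the tool that produced the figures in the list.

```python
import ast

SAMPLE = '''
def handle(msg):
    if msg:
        for part in msg:
            while part and part.ready:
                if part.kind == "a" or part.kind == "b":
                    return part
    return None
'''


def branch_points(source: str) -> int:
    """Count decision points: branches, loops, handlers and extra boolean operands."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)):
            count += 1
        elif isinstance(node, ast.BoolOp):
            count += len(node.values) - 1  # "a and b and c" adds two paths
    return count


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Deepest chain of nested block statements below the given node."""
    nested = (ast.If, ast.For, ast.While, ast.Try, ast.With)
    best = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, nested) else depth
        best = max(best, max_nesting(child, child_depth))
    return best


print(branch_points(SAMPLE), max_nesting(ast.parse(SAMPLE)))
```

Note that Python represents `elif` as an `if` nested inside `else`, so this simple depth count treats long `elif` chains as deeper than they look on screen.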
Yeah, I honestly don't understand his comment. Is it bad code writing? Pre-2026? Sure. In 2026? Nope. Is it going to be a headache for some poor person on call? Yes. But then again, are you "supposed" to go through every single line in 2026? Again, no. I hate it. But the world is changing, and till the bubble pops this is the new norm.
Would be interesting to run this through Malus [1] or literally just Claude Code and get open source Claude Code out of it.
I jest, but in a world where these models have been trained on gigatons of open source I don't even see the moral problem. IANAL, don't actually do this.
The problem is the OAuth and their stance on bypassing that. You'd want to use your subscription, and they probably can detect that and ban users. They hold all the power there.
I don’t think that’s a good comparison. There isn’t anything preventing Anthropic from, say, detecting whether the user is using the exact same system prompt and tool definitions as Claude Code and calling it a day. That would make developing other apps nearly impossible.
It’s a dynamic, subscription based service, not a static asset like a video.
It is a real product. They take real payments and deliver on what's promised.
Not sure if it's an attempt to subvert criticism by using satirical language, or if they truly have so little respect for the open source community.
Neat. Coincidentally, I recently asked Claude about the Claude CLI, and whether it is possible to patch some annoying things (like not being able to expand Ctrl + O more than once, so never being able to see some lines, and in general to have more control over the context). It happily proclaimed that it is open source and that it could do it ... and started doing something. Then I checked a bit and saw, nope, not open source. And by the wording of the TOS, it might break some of its terms. But Claude said "no worries", it only breaks the TOS technically. So by saving that conversation I would have some defense if I started messing with it, but I felt a bit uneasy and stopped the experiment. Also, Claude got into a loop, but if I pointed it at this, it might work, I suppose.
I think that you do not need to feel uneasy at all. It is your computer and your memory space that the data is stored and operating in; you can do whatever you like to the bits in that space. I would encourage you to continue that experiment.
Well, the thing is, I do not just use my computer, but connect to their computers, and I do not want to get banned. I suppose simple UI things like expanding source files won't change a thing, but the more interesting things, like editing the context, do carry that risk, and I have no idea if they look for it or enforce it. Their side is: if I want full control, I need to use the API directly (way more expensive), and what I want to do is basically circumventing that.
Really surprising how many people are downplaying this leak!
"Google and OpenAI have already open sourced their Agents, so this leak isn't that relevant." What Google and OpenAI have open sourced is their Agents SDK, a toolkit, not the secret sauce of how their flagship agents are wired under the hood!
expect the takedown hammer on the tweet, the R2 link, and any public repos soon
It's exactly the same as the open source codex/gemini and other clis like opencode. There is no secret sauce in the claude cli, and the agent harness itself is no better (worse IMO) than the others. The only thing interesting about this leak is that it may contain unreleased features/flags that are not public yet and hint at what Anthropic is working on.
Anthropic team does an excellent job of speeding up Claude Code when it slows down, but for the sake of RAM and system resources, it would be nice to see it rewritten in a more performant framework!
Well, Claude does boast an absolutely cursed (and very buggy) React-based TUI renderer that I think the others lack! What if someone steals it and builds their own buggy TUI app?
They can't. AI generated code cannot be copyrighted. They've stated that claude code is built with claude code. You can take this and start your own claude code project now if you like. There's zero copyright protection on this.
I'm sure it's not _entirely_ built that way, and practically speaking GitHub will almost certainly take it down rather than doing some kind of deep research about which code is which.
That's fine. File a false claim DMCA and that's felony perjury :) They know for a fact that there is no copyright on AI generated code, the courts have affirmed this repeatedly.
Very easily these days. Even if minified code is difficult for me to reverse engineer, Claude has a very easy time finding exactly what to patch to fix something.
Not really, except that they have a bunch of weird things in the source code and people like to make fun of it. OpenCode/Codex generally doesn't have this since these are open-source projects from the get go.
They do have a couple of interesting features that have not been publicly heard of yet:
Like KAIROS which seems to be like an inbuilt ai assistant and Ultraplan which seems to enable remote planning workflows, where a separate environment explores a problem, generates a plan, and then pauses for user approval before execution.
Gemini CLI and Codex are open source anyway. I doubt there was much of a moat there anyway. The cool kids are using things like https://pi.dev/ anyway.
Copilot on OAI reveals everything meaningful about its functionality if you use a custom model config via the API. All you need to do is inspect the logs to see the prompts they're using. So far no one seems to care about this "loophole". Presumably, because the only thing that matters is for you to consume as many tokens per unit time as possible.
The source code of the slot machine is not relevant to the casino manager. He only cares that the customer is using it.
The original Llama models leaked from Meta. Instead of fighting it, they decided to publish them officially. It was a real boost to the OS/OW models movement, and they led it for a while after that.
It would be interesting to see that same thing with CC, but I doubt it'll ever happen.
Are there any interesting/unique features present in it that are not in the alternatives? My understanding is that it's just a client for the powerful LLM.
From the directory listing having a cost-tracker.ts, upstreamproxy, coordinator, buddy and a full vim directory, it doesn't look like just an API client to me.
It really doesn’t matter anymore. I’m saying this as a person who used to care about it. It does what it’s generally supposed to do, and it has users. Those are the two things that matter in this day and age.
It may be economically effective but such heartless, buggy software is a drain to use. I care about that delta, and yes this can be extrapolated to other industries.
Genuinely I have no idea what you mean by buggy. Sure there are some problems here and there, but my personal threshold for “buggy” is much higher. I guess, for a lot of other people as well, given the uptake and usage.
Two weeks ago typing became super laggy. It was totally unusable.
Last week I had to reinstall Claude Desktop because every time I opened it, it just hung.
This week I am sometimes opening it and getting a blank screen. It eventually works after I open it a few times.
And of course there's people complaining that somehow they're blowing their 5 hour token budget in 5 messages.
It's really buggy.
There's only so long their model will be their advantage before they all become very similar, and then the difference will be how reliable the tools are.
Right now the Claude Code code quality seems extremely low.
This is the dumbest take there is about vibe coding. Claiming that managing complexity in a codebase doesn't matter anymore. I can't imagine that a competent engineer would come to the conclusion that managing complexity doesn't matter anymore. There is actually some evidence that coding agents struggle the same way humans do as the complexity of the system increases [0].
I agree, there is obviously “complete burning trash” and there’s this. Ant team has got a system going on for them where they can still extend the codebase. When time comes to it, I’m assuming they would be able to rewrite as feature set would be more solid and assuming they’ve been adding tests as well.
Reverse-engineering through tests has never been easier, which could collapse the complexity and clean up the code.
Users stick around on inertia until a failure costs them money or face. A leaked map file won't sink a tool on its own, but it does strip away the story that you can ship sloppy JS build output into prod and still ask people to trust your security model.
'It works' is a low bar. If that's the bar you set you are one bad incident away from finding out who stayed for the product and who stayed because switching felt annoying.
“It works and it’s doing what it’s supposed to do” encompasses the idea that it’s also not doing what it’s not supposed to do.
Also “one bad incident away” never works in practice. The last two decades have shown how people will use the tools that get the job done no matter what kinda privacy leaks, destructive things they have done to the user.
The team has been extremely open about how it has been vibe coded from day 1. Given the insane amount of releases, I don’t think it would be possible without it.
It’s not a particularly sophisticated tool. I’d put my money on one experienced engineer being able to achieve the same functionality in 3-6 months (even without the vibe coding).
I don't really care about the code being an unmaintainable mess, but as a user there are some odd choices in the flow which I feel could benefit from human judgement.
I’m not strongly opinionated, especially with such a short function, but in general early return makes it so you don’t need to keep the whole function body in your head to understand the logic. Often it saves you having to read the whole function body too.
But you can achieve a similar effect by keeping your functions small, in which case I think both styles are roughly equivalent.
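For readers who have not seen the two styles side by side, here is a small sketch. Both functions are hypothetical (they are not taken from `useCanUseTool.tsx`) and return the same answers; only the shape differs.

```python
def can_use_nested(tool, user):
    """Nested style: the result is only known after reading every branch."""
    if tool is not None:
        if tool.get("enabled"):
            if user.get("role") == "admin":
                result = "allow"
            else:
                result = "ask"
        else:
            result = "deny"
    else:
        result = "deny"
    return result


def can_use_early(tool, user):
    """Early-return style: each guard settles one case and drops out of the way."""
    if tool is None:
        return "deny"
    if not tool.get("enabled"):
        return "deny"
    if user.get("role") == "admin":
        return "allow"
    return "ask"
```

With a function this short the difference is mostly taste, which is the point made above; the guard style starts to pay off as the number of conditions grows.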
useCanUseTool.tsx looks special, maybe it's codegen'ed or copy 'n pasted? `_c` as an import name, no comments, use of promises instead of async functions. Or maybe it's just bad vibing...
Maybe, I do suspect _some_ parts are codegen or source map artifacts.
But if you take a look at the other file, for example `useTypeahead` you'd see, even if there are a few code-gen / source-map artifacts, you still see the core logic, and behavior, is just a big bowl of soup
1. Randomly peeking at process.argv and process.env all around. Other weird layering violations, too.
2. Tons of repeat code, eg. multiple ad-hoc implementations of hash functions / PRNGs.
3. Almost no high-level comments about structure - I assume all that lives in some CLAUDE.md instead.
That's exactly why, access to global mutable state should be limited to as small a surface area as possible, so 99% of code can be locally deterministic and side-effect free, only using values that are passed into it. That makes testing easier too.
environment variables can change while the process is running and are not memory safe (though I suspect node tries to wrap it with a lock). Meaning if you check a variable at point A, enter a branch and check it again at point B ... it's not guaranteed that they will be the same value. This can cause you to enter "impossible conditions".
It's implicit state that's also untyped - it's just a String -> String map without any canonical single source of truth about what environment variables are consulted, when, why and in what form.
Such state should be strongly typed, have a canonical source of truth (which can then be also reused to document environment variables that the code supports, and eg. allow reading the same options from configs, flags, etc) and then explicitly passed to the functions that need it, eg. as function arguments or members of an associated instance.
This makes it easier to reason about the code (the caller will know that some module changes its functionality based on some state variable). It also makes it easier to test (both from the mechanical point of view of having to set environment variables which is gnarly, and from the point of view of once again knowing that the code changes its behaviour based on some state/option and both cases should probably be tested).
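A minimal sketch of what that could look like, assuming two made-up variables (`APP_DEBUG` and `APP_MAX_RETRIES`) and hypothetical names `Settings`, `load_settings` and `describe_retry_policy`. The environment is read in exactly one place, parsed into a typed, immutable object, and everything else receives that object as an argument.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Single source of truth for the options this (hypothetical) program supports."""
    debug: bool
    max_retries: int


def load_settings(env: Mapping[str, str]) -> Settings:
    """Parse the raw String -> String map once, at startup."""
    return Settings(
        debug=env.get("APP_DEBUG", "0") == "1",
        max_retries=int(env.get("APP_MAX_RETRIES", "3")),
    )


def describe_retry_policy(settings: Settings) -> str:
    """Depends only on its argument, so it is deterministic and easy to test."""
    mode = "verbose" if settings.debug else "quiet"
    return f"{settings.max_retries} retries, {mode} logging"
```

In a real program you would call `load_settings(os.environ)` once in `main`; tests can pass a plain dict instead and never have to touch the process environment. Because the object is frozen, the value seen at point A is the value seen at point B.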
Code quality no longer carries the same weight as it did pre-LLMs. It used to matter because humans were the ones reading/writing it, so you had to optimize for readability and maintainability. But these days what matters is that the AI can work with it and you can reliably test it. Obviously you don’t want code quality to go totally down the drain, but there is a fine balance.
Optimize for consistency and a well thought out architecture, but let the gnarly looking function remain a gnarly function until it breaks and has to be refactored. Treat the functions as black boxes.
Personally the only time I open my IDE to look at code, it’s because I’m looking at something mission critical or very nuanced. For the remainder I trust my agent to deliver acceptable results.
Wow it's true. Anthropic actually had me fooled. I saw the GitHub repository and just assumed it was open source. Didn't look at the actual files too closely. There's pretty much nothing there.
So glad I took the time to firejail this thing before running it.
LLMs are good in JS and Python which means everything from now on will be written in or ported to either of those two languages.
So yeah, JS is the future of all software.
It shows that a company you and your organization are trusting with your data, and allowing full control over your devices 24/7, is failing to properly secure its own software.
It is a client running in an interpreted language on your own computer; there is nothing to secure or hide, as the source was provided to you already. Or am I mistaken?
Can we stop referring to source maps as leaks? It was packaged in a way that wasn’t even obfuscated. Same as websites - it’s not a “leak” that you can read or inspect the source code.
Maybe the OP could clarify, I don't like reading leaked code, but I'm curious:
my understanding is that it is the source code for "claude code", the coding assistant that remotely calls the LLMs.
Is that correct? The weights of the LLMs are _not_ in this repo, right?
It sure sucks for Anthropic to get pwned like this, but it should not affect their bottom line much?
I guess these words are to be avoided...
Regex is going to be something like 10,000 times quicker than the quickest LLM call, multiply that by billions of prompts
it is not that slow
Thanks
This has buttbuttin energy. Welcome to the 80s I guess.
I've seen Claude Code go with a regex approach for a similar sentiment-related task.
You could always tell when a sysadmin started hacking up some software by the if-else nesting chains.
https://github.com/chatgptprojects/claude-code/blob/642c7f94...
EDIT: I just realized this might be used without publishing the changes, for internal evaluation only as you mentioned. That would be a lot better.
Except for the one Sam Altman is building.
The undercover mode prompt was generated using AI.
Buddy system is this year's April Fool's joke, you roll your own gacha pet that you get to keep. There are legendary pulls.
They expect it to go viral on Twitter so they are staggering the reveals.
[1]: serious: https://github.com/mech-lang/mech/releases/tag/v0.3.1-beta
[2]: joke: https://github.com/cmontella/purrtran
This is the single worst function in the codebase by every metric:
This should be at minimum 8–10 separate modules.
If it's entirely generated / consumed / edited by an LLM, arguably the most important metric is... test coverage, and that's it?
https://malus.sh/
The real value here will be in using other cheap models with the cc harness.
So not even close to Opus, then?
These are a year behind, if not more. And they're probably clunky to use.
“Let's end open source together with this one simple trick”
https://pretalx.fosdem.org/fosdem-2026/talk/SUVS7G/feedback/
Malus is translating code into text, and from text back into code.
It gives the illusion of clean room implementation that some companies abuse.
The irony is that ChatGPT/Claude answers are all actually directly derived from open-source code, so...
https://www.youtube.com/watch?v=6godSEVvcmU
Who'd have thought, the audience who doesn't want to give back to the opensource community, giving 0 contributions...
https://www.youtube.com/watch?v=6godSEVvcmU
They won't even read your defence.
https://news.ycombinator.com/item?id=47582220
And now, with Claude on a Ralph loop, you can.
this one has more stars and is more popular
"Don't blow your cover"
Interesting to see them be so informal and use an idiom to a computer.
And using capitals for emphasis.
Not exactly this, but close.
I hope it's common knowledge that _any_ client-side JavaScript is exposed to everyone. Perhaps minified, but still easily reverse-engineerable.
There were/are a lot of discussions on how the harness can affect the output.
(I work on OpenCode)
Surely there's nothing here of value compared to the weights except for UX and orchestration?
Couldn't this have just been decompiled anyhow?
They could have written that in curl+bash that would not have changed much.
The source code of the slot machine is not relevant to the casino manager. He only cares that the customer is using it.
Famously code leaks/reverse engineering attempts of slot machines matter enormously to casino managers
[0] - https://en.wikipedia.org/wiki/Ronald_Dale_Harris#:~:text=Ron...
[1] - https://cybernews.com/news/software-glitch-loses-casino-mill...
[2] - https://sccgmanagement.com/sccg-news/2025/9/24/superbet-pays...
> current: 2.1.88 · latest: 2.1.87
Which makes me think they pulled it - although it still shows up as 2.1.88 on npmjs for now (cached?).
https://github.com/oboard/claude-code-rev
[0] https://arxiv.org/abs/2603.24755
It's extremely nested, it's basically an if statement soup
`useTypeahead.tsx` is even worse, extremely nested, a ton of "if else" statements, I doubt you'd look at it and think this is sane code
Do you care to elaborate? "if (...) return ...;" looks closer to an expression for me:
What is the problem with that? How would you write that snippet? It is common in the new functional js landscape, even if it is pass-by-ref.
Or is there an open source front-end and a closed backend?
No, it's not even source available.
> Or is there an open source front-end and a closed backend?
No, it's all proprietary. None of it is open source.
https://github.com/openai/codex
[1] https://www.amazon.com/Programming-TypeScript-Making-JavaScr...
Language servers, however, are a pain on Claude Code. https://github.com/anthropics/claude-code/issues/15619
But a lot of desktop tools are written in JS because it's easy to create multi-platform applications.
Why weren't proper checks in place in the first place?
Bonus: why didn't they set up their own AI-assisted tools to harness the release checks?
It's a wake up call.
This code hasn't been open source until now and contains information like the system prompts, internal feature flags, etc.
Don't worry about that, the code in that repository isn't Anthropic's to begin with.