36 points | by jxmorris12 1 day ago
3 comments
David looks into the LLM, finds the thinking layers, cuts out the duplicates, and puts the rest back to back.
This increases the LLM's scores with basically no overhead.
Very interesting read.
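The idea described above — detect near-duplicate layers and keep only distinct ones back to back — could be sketched like this. This is a hypothetical toy, not the article's actual method: the layers are stand-in weight matrices, and the cosine-similarity criterion and `threshold` value are assumptions for illustration.

```python
import numpy as np

def dedup_layers(layers, threshold=0.99):
    """Keep only layers whose weights differ enough from the last kept layer.

    `layers` is a list of 2-D weight matrices standing in for transformer
    blocks; a real model would need a richer similarity signature.
    (Hypothetical sketch, not the method from the article.)
    """
    kept = [layers[0]]
    for w in layers[1:]:
        a, b = kept[-1].ravel(), w.ravel()
        cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        if cos < threshold:  # not a near-duplicate: keep it
            kept.append(w)
    return kept

# Toy demo: three distinct layers, with the middle one duplicated twice.
rng = np.random.default_rng(0)
base = [rng.normal(size=(4, 4)) for _ in range(3)]
stack = [base[0], base[1], base[1].copy(), base[1].copy(), base[2]]
pruned = dedup_layers(stack)
print(len(stack), "->", len(pruned))  # 5 -> 3
```

The exact duplicates have cosine similarity 1.0 and are dropped, while the distinct random layers stay, so the pruned stack runs the surviving layers back to back.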
But what's in the context window is sharp: the exact text or video frame right in front of it.
The goal is to bring more of the world into that context.
Compression gives it intuition. Context gives it precision.
Imagine if we could extract the model's reasoning core and plug it anywhere we want.