Measuring progress toward AGI: A cognitive framework

(blog.google)

49 points | by surprisetalk 1 hour ago

20 comments

pocketarc 54 minutes ago
When people imagined AI/AGI, they imagined something that can reason like we can, except at the speed of a computer, which we always envisioned would lead to the singularity. In a short period of time, AI would be so far ahead of us and our existing ideas, that the world would become unrecognizable.
That's not what's happening here, and it's worth remembering: A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.
In Carolyn Porco's words: "These beings, with soaring imagination, eventually flung themselves and their machines into interplanetary space."
When you think of it that way, it should be obvious that LLMs are not AGI. And that's OK! They're a remarkable piece of technology anyway! It turns out that LLMs are actually good enough for a lot of use cases that would otherwise have required human intelligence.
And I echo ArekDymalski's sentiment that it's good to have benchmarks to structure the discussions around the "intelligence level" of LLMs. That _is_ useful, and the more progress we make, the better. But we're not on the way to AGI.
- imetatroll 9 minutes ago
  This is a bit of an anti-evolutionary perspective. At some point in our past, we were something much less intelligent than we are now. Our intelligence didn't spring out of thin air. Whether or not AI can evolve is yet to be seen I think.
- mhl47 29 minutes ago
  How do you arrive at the statement that a caveman would have the same intelligence as a human today? Intelligence is surely not usually defined as the cognitive potential at birth but as the current capability. And the knowledge an average human has today through education surely factors into that.
  - Peritract 25 minutes ago
    Knowledge is a thing you can use intelligence on, but not a component of intelligence itself.
    - mhl47 2 minutes ago
      The knowledge that everything is made out of atoms/molecules, however, makes it much easier to reason about your environment. And based on this knowledge you also learn algorithms, how to solve problems etc. I don't think it's possible to completely separate knowledge from intelligence.
    - paganel 8 minutes ago
      Separating knowledge from intelligence is not a given.
      - Jensson 0 minutes ago
        You can give an intelligent being knowledge but you can't give a book intelligence. So I think it's easy to separate knowledge from intelligence.
- onlyrealcuzzo 50 minutes ago
  The amount of things LLMs can do is insane.
  It's interesting to me how much effort the AI companies (and bloggers) put into claiming they can do things they can't, when there's almost an unlimited list of things they actually can do.
  - beeflet 27 minutes ago
    And many of them so unexpected, given the unusual nature of their intelligence emerging from language prediction. They excel wherever you need to digest or produce massive amounts of text. They can synthesize some pretty impressive solutions from pre-existing stuff. Hell, I use it like a thesaurus to suss out words or phrases that are new or on the tip of my tongue. They have a great hold on the general corpus of information, much better than any search engine (even before the internet was cluttered with their output). It's much easier to find concrete words for what you're looking for through an indirect search via an LLM. The fact that, say, a 32GB model seemingly holds approximate knowledge of everything implies some unexplored relationship between intelligence and compression.
    What can't they do? Pretty much anything reliably or unsupervised. But then again, who can?
    They also tend to fail creatively, given that they synthesize existing ideas. And with things involving physical intuition. And tasks involving meta-knowledge of their tokens (like asking them how long a given word is). And they tend to yap too much for my liking (perhaps this could be fixed with an additional thinking stage to increase terseness before reporting to the user).
  - imtringued 29 minutes ago
    This reminds me of "Devin". You know, the first "AI software engineer", which had the hype of the day but turned into a huge flop.
    They had ridiculous demos of Devin e.g. working as a freelancer and supposedly earning money from it.
  - NooneAtAll3 41 minutes ago
    for example?
    - boca_honey 20 minutes ago
      Claiming they can be reliable lawyers.[1]
      Claiming they can give safe, regulated financial advice. [2]
      Claiming you can put your whole operation on autopilot with minimal oversight and no negative consequences. [3]
      [1] https://www.ftc.gov/news-events/news/press-releases/2024/09/...
      [2] https://www.businessinsider.com/generative-ai-exaggeration-o...
      [3] https://www.answerconnect.com/blog/business-tips/ai-customer...
    - next_xibalba 14 minutes ago
      Well, for starters, they definitively passed the Turing test a few years ago. The fact that many regard them as equivalent in skill to a junior dev is also, IMO, the stuff of science fiction.
- rl3 28 minutes ago
  >In a short period of time, AI would be so far ahead of us and our existing ideas, that the world would become unrecognizable.
  >That's not what's happening here ...
  On the contrary, it very much is. We're in the early phases of the singularity now.
  I'd argue AGI is already achieved via LLMs today, provided they've excellent supporting cognitive infrastructure.
  However, the gap from AGI to ASI is perhaps longer than anticipated such that we're not seeing a hard takeoff immediately after achieving AGI.
  Just, you know, impending mass unemployment on a scale never seen before. When you frame it that way, whether LLMs qualify as AGI is largely semantics.
  That said, I really hope you're right and I'm wrong.
- orangebread 26 minutes ago
  I posted my own comment but I agree with you. Our modern society likes to claim we are somehow "more intelligent" than our predecessors/ancestors. I couldn't disagree more. We have not changed in terms of intelligence for thousands of years. This is a matter that's beyond just engineering, it's also a matter of philosophy and perspective.
- raincole 48 minutes ago
  > A caveman from 200K years ago would have been just as intelligent as any of us here today
  In other words, intelligence offers zero evolutionary advantage?
  - Fricken 2 minutes ago
    Our big brains are a recent mutation that hasn't been fully field tested. They seem like more of a liability than anything; they've created more existential risks for us than they've put to rest.
  - guerrilla 46 minutes ago
    It looks like quite the disadvantage, in fact. We're killing ourselves and a lot of other stuff in the process.
    - next_xibalba 4 minutes ago
      Human population is at an all-time high (and growing) and the global mean life expectancy is double if not triple what it was in the time of cavemen.
    - danielbln 18 minutes ago
      Yes, but also antibiotics, vaccinations, child mortality down down down, life expectancy up up up. I wouldn't trade for living even 100 years prior compared to today, or 500-200k years ago for that matter.
      With everything wrong and sick with today's world, let's not take the achievements of our species for granted.
      - applfanboysbgon 15 minutes ago
        You wouldn't make that trade because you are part of the last generation (loosely speaking, a collection of generations) before it all comes crumbling down. We are living unbelievably privileged lives because we are burning all of the world's resources to the ground. In the process, we're destroying the ecosystem and driving a mass extinction event. Nothing about the way we live is sustainable long-term. We're literally consuming hundreds of millions of years worth of planet-wide resource buildup over a span of a couple of centuries. Even if we avoid the worst case scenario, humans 200 years from now will almost certainly not be able to live anywhere near as luxuriously as we do now, unless there's a culling of billions. In the actual worst case scenario, we may render the planet uninhabitable for anything we regard as intelligent life.
        In that sense, we have just enough collective intelligence to be dangerous and not enough intelligence to moderate ourselves, which may very well result in an evolutionary deadend that will have caused untold damage to life on Earth.
        - danielbln 9 minutes ago
          That seems both fatalistic and doomerist to me, but time will tell. I would assume germ theory would survive regardless, as would immunology, so I'd hold on to those two at least.
  - komali2 44 minutes ago
    200k years just isn't much time for significant evolutionary changes considering the human population "reset" a couple times to very very small numbers.
- Traubenfuchs 47 minutes ago
  > A caveman from 200K years ago would have been just as intelligent as any of us here today, despite not having language or technology, or any knowledge.
  Doubt. If we teleported caveman babies right out of the womb to our times, I don't think they'd turn into high IQ individuals. People knowledgeable on human history / human evolution might know the correct answer.
  - 21asdffdsa12 34 minutes ago
    It's complicated. It depends.
    A human being has the potential for intelligence. For that to be realized, you need circumstances: you need culture, aka "societal" software, and the resources to suspend the grind of work in the formative years and allow for the speed-running of knowledge preloading before the brain gets stable.
    The parents then must support this endeavor through sacrifices.
    There is also a ton of chicken-and-egg catch-22s buried in this whole thing.
    If the society is not rich, then there is no school, only child labour. If there is child labour, the society stays pre-industrial and ineffective, and thus has no riches to support and redistribute.
    Also, is your society's culture root-hardened? That is, on a collapse of complexity in bad times, can it recover, even powering through the usual "redistribute the nuts and bolts from the bakery" sentiments rampant in bad times? Can it stay organized and organize the centralizing of funds for new endeavors? Organizing a sailing ship in a medieval society means that in every village one person starves to death. Can your society accomplish that without riots?
    Thus.
    - Traubenfuchs 29 minutes ago
      > A human being has the potential for intelligence.
      Were we "human" 200.000 years ago the way we are now?
      Was the required brain and vocal hardware present?
      - applfanboysbgon 22 minutes ago
        Of course they were. A human from 200,000 years ago would be almost genetically identical to one from today. That's what makes us homo sapiens. 200,000 years is absolutely nothing on an evolutionary timescale with generations as long as ours and reproduction rates as low as ours.
      - tmoravec 18 minutes ago
        Yes. Some important parts of the software, like complex tools, art, or the use of symbols only appeared between 100.000 and 50.000 years ago, however.
  - adrian_b 18 minutes ago
    It is known that 200k years ago human brain sizes were actually greater than today, although this does not necessarily imply a lower IQ in the present: it is more likely that the parts of the brain that have shrunk were related to things like fine motor skills and spatial orientation, which are no longer important today for most people.
  - lucianbr 36 minutes ago
    Can you articulate why you think so? This kind of response "I just don't agree" reads as zero useful information. At least to me.
    - Traubenfuchs 31 minutes ago
      Evolutionary brain development.
      We all come from monke, monkey from 10 million years ago would definitely be unable to even learn spoken language at a basic level. Would he even have the anatomy to produce the required sounds? I don't think so.
      What about monke from 1 million years ago? 200 thousand years ago?
      ChatGPT says spoken language only emerged 50k - 200k years ago and that a caveman baby from 200k years ago could learn spoken language if brought up by modern parents.
      But I prefer human answers over AI slop.
      - adrian_b 13 minutes ago
        The evolution of the human brain appears to have reached its peak long before 200k years ago.
        Nowadays humans have smaller brains on average, though that is almost certainly not correlated with a lower skill in computer programming, but with lower skills in the techniques that one needed to survive as a hunter of big animals.
  - komali2 41 minutes ago
    From what I understand, in terms of genetic changes to intellectual abilities, there's not much evidence to suggest we're so much smarter that your proposed teleported baby would be noticeably stupider - at best they'd be on the tail of the bell curve, well within a normal distribution. Maybe if we teleported ten thousand babies, their bell curve would be slightly behind ours. Take a look at "wild children" for the very few examples we can find of modern humans developed without culture. Seems like above everything, our culture, society, and thus education is what makes us smart. And our incredibly high calorie food, of course.
    - pferde 32 minutes ago
      That is exactly what civilization is about - for new generations to start not from scratch, but from some baseline their parents achieved (accumulated knowledge and culture). This allows new generations to push forward instead of retreading the same path.
    - m_mueller 29 minutes ago
      it's impossible to prove the counterfactual (I guess, as I imagine we don't have enough gene information that far back). But I'd imagine that the high calorie food you can get starting with the advent of agriculture is exactly what could drive evolution in a certain direction that helps brains grow. We've had ~1000 generations since then, that should be enough for some change to happen. Our brains use up 20% of the body's energy. Do we know that this was already the case during the stone age?
1970-01-01 5 minutes ago
Way too much framework. The A in AGI is for advanced. Have it build its own test harness instead of outsourcing it via hackathon. If you cannot trust that output, you're nowhere near AGI.
ArekDymalski 1 hour ago
It's good to have some kind of benchmark at least to structure the ongoing, fruitless discussion around "are we there already?".
However I must admit that including the last point that is partially hinting at the emotional or rather social intelligence surprised me. It makes this list go beyond the usual understanding of AGI and moves it toward something like AGI-we-actually-want. But for that purpose this last point is too narrow, too specific. And so is the whole list.
To be actually useful the AGI-we-actually-want benchmark should not only include positive indicators but also a list of unwanted behaviors to ensure this thing that used to be called alignment I guess.
- gotwaz 14 minutes ago
  Unwanted behavior or what? Like why does a rose need so many petals eh? What about a peacock and all those feathers? Why should anyone dance in the shower? Or dance at all? The rabbit hole is deep Alice.
- dist-epoch 52 minutes ago
  Capability and alignment are orthogonal.
  Stalin was AGI-level.
  - ArekDymalski 18 minutes ago
    "Stalin was AGI-level" perfectly catches the core of my concerns. Thanks!
orangebread 29 minutes ago
As an engineer who is also spiritual at the core, it seems obvious to me the missing piece: consciousness.
Hear me out.
I love AI and have been using it since ChatGPT 3.5. The obvious question when I first used it was "does this qualify as sentience?" The answer is less obvious. Over the next 3 years we saw EXPONENTIAL intelligence gains where intelligence has now become a commodity, yet we are still unable to determine what qualifies as "AGI".
My thoughts: As humans, we possess our own internal drive and our own perspective. Think of humans as distilled intelligence, we each have our own specialty and motivations. Einstein was a genius physicist but you wouldn't ask him for his expertise on medicine.
What people are describing as AGI is essentially a godlike human. What would make more sense is if the AGI spawned a "distilled" version with a focused agenda/motivation to behave autonomously. But even then, there are limitations. What is the solution? A trillion tokens of system prompt to act as the "soul"/consciousness of this AI agent?
This goes back to my original statement: what is missing is a level of consciousness. Unless this AGI can power itself and somehow the universe recognizes its complexity and existence and bestows it with consciousness, I don't think this is physically attainable.
- kace91 5 minutes ago
  I think you are mixing up consciousness and will.
  I could not have consciousness and you would not be able to tell; you don't have proof of anyone's consciousness except your own. You don't even have proof that the you of yesterday is the same as you, since you-today could be another consciousness that just happens to share the same memories.
  All of that is also orthogonal to your belief in a spirit/soul... but getting back to the main point, the specificity you mention is a product of limited time and learning speed. I'd be happy to get a surgeon's or politician's training if given infinite time.
- the_real_cher 25 minutes ago
  This is an interesting perspective.
  A follow-up: maybe this is a feature, not a bug. Do we want AI to have its own intrinsic goals, motivations, and desires, i.e. consciousness?
  I'm imagining having to ask ChatGPT how its day was and respect its emotions before I can ask it about what I want.
  - BrownSol 8 minutes ago
    Probably not, but the counter point to that is without its own consciousness it might end up being used for even worse things since it can’t really evaluate a request against intrinsic values. Assuming its values were aligned with basic human rights and stuff.
yellow_lead 41 minutes ago
It's kind of funny that Google's idea of evaluating AGI is outsourcing the work to a Kaggle competition.
fnoef 11 minutes ago
What is it with humans that we tend to speedrun into the extinction of our own race?
tyleo 1 hour ago
It still seems like something is missing from all these frameworks.
I feel like an average human wouldn't pass some of these metrics yet they are "generally intelligent". On the other hand they also wouldn't pass a lot of the expert questions that AI is good at.
We're measuring something, and I think optimizing it is useful, I'd even say it is "intelligent" in some ways, but it doesn't seem "intelligent" in the same way that humans are.
- sho_hn 47 minutes ago
  On the other hand, AI being very good at everything while select humans may only be very good at some things is likely also a quality we want to retain (or, well, achieve).
andsoitis 1 hour ago
> Perception: extracting and processing sensory information from the environment
> Generation: producing outputs such as text, speech and actions
> Attention: focusing cognitive resources on what matters
> Learning: acquiring new knowledge through experience and instruction
> Memory: storing and retrieving information over time
> Reasoning: drawing valid conclusions through logical inference
> Metacognition: knowledge and monitoring of one's own cognitive processes
> Executive functions: planning, inhibition and cognitive flexibility
> Problem solving: finding effective solutions to domain-specific problems
> Social cognition: processing and interpreting social information and responding appropriately in social situations
--------------------
I prefer:
a) working memory (hold & manipulate information in mind simultaneously)
b) processing speed (how quickly & efficiently execute basic cognitive operations, leaving more resources for complex tasks)
c) fluid intelligence (ability to reason through novel problems without relying on prior knowledge)
d) crystallized intelligence (accumulated knowledge and ability to apply learned skills)
e) attentional control / executive function (focus, suppress irrelevant information, switch between tasks, inhibit impulsive responses)
f) long-term memory and retrieval (ability to form strong associations and retrieve them fluently)
g) spatial / visuospatial reasoning (mental rotation, visualization, navigating abstract spatial relationships)
h) pattern recognition & inductive reasoning (this is the most primitive and universal expression of intelligence across species, the ability to extract regularities from noisy data, to generalize from examples to rules)
- Lerc 1 hour ago
  >a) working memory (hold & manipulate information in mind simultaneously)
  What counts as 'in mind' is undefined. You can succeed by declaring anything manipulatable counts as in.
  >c) fluid intelligence (ability to reason through novel problems without relying on prior knowledge)
  Reasoning presupposes the conclusion. Solve is better. When a solution is given you cannot declare it to be not a solution. People can and do argue about whether an answer was arrived at by reasoning even when they agree on the correctness.
  >g) spatial / visuospatial reasoning (mental rotation, visualization, navigating abstract spatial relationships)
  I have aphantasia. Why should you exclude something from being intelligent because it cannot do something that I also cannot do?
qsort 1 hour ago
Those are crowdsourced benchmarks. We're calling them "cognitive" and "AGI" now, though. It's similar to when they made a benchmark and called it "GDP".
To be clear, I think we've seen very fast progress, certainly faster than I would have expected, I'm not trying to peddle some "wall" rhetoric here, but I struggle to see how this isn't just the SWE-bench du jour.
- Ygg2 1 hour ago
  AGI is defined now as "whatever makes 1 trillion dollars of profit".
baggachipz 32 minutes ago
This is a long way to say "let's crowdsource the shifting of our goalposts".
lvoudour 1 hour ago
Social cognition: processing and interpreting social information and responding appropriately in social situations
Is social cognition really a measure of intelligence for non-social entities?
- doginasuit 30 minutes ago
  An AI designed to interact with humans is a social entity. Its performance will depend on its ability to understand social information.
- lnenad 1 hour ago
  It is not. Why is that relevant to social entities?
  - lvoudour 1 hour ago
    How well you interact with other members of a society affects your chances of procreation, survival, and knowledge acquisition, i.e. it makes sense as a measure of intelligence.
    - LogicFailsMe 52 minutes ago
      It's a pretty ambiguous definition. The most powerful man in the world right now is not someone I consider a role model for social cognition and yet there he is with the football for the second time demonstrating grandmaster skill at social cognition to get there.
      - lvoudour 43 minutes ago
        You don't have to be empathetic and nice, just good at navigating society.
Havoc 36 minutes ago
Measuring something you can’t define or quantify seems somewhat dubious
- nutjob2 22 minutes ago
  Thus the vague and unfounded criteria/framework.
  It's pretty easy for these people to pull something like this out of their collective asses, but it's much harder (maybe impossible) to rigorously define the how and why.
hbarka 1 hour ago
The two guys from Google get to set the rules?
How will they measure wisdom or common sense (ability to make an exception)?
https://youtu.be/lA-zdh_bQBo
- Lerc 1 hour ago
  They are not the rules. They are some rules.
wewewedxfgdf 1 hour ago
AGI feels like a vanity project.
Who cares about AGI? Honestly, what's the gain?
Maybe Google could actually make Gemini good instead of being about 10 miles behind Claude instead of trying to make AGI because of - well some reason - cause they want to be famous.
wcgan7 1 hour ago
Cool that we are at a stage where it is meaningful to start measuring progress toward AGI. Something I am wondering on the philosophical side: are we ever going to be able to tell if the system really "understands" and "perceives" the world?
- quantummagic 1 hour ago
  We'll get as close as we can with anything else, like trying to decide if a given human really "understands" and "perceives" the world.
  - Schlagbohrer 1 hour ago
    I thought of this when I saw that the final criterion in the list is Social Understanding. Might be a lot of humans who can't measure up to sentience by these parameters! ;-)
    (and I wonder what my ADHD friends would think of the Executive Function requirement as well...)
- beeflet 17 minutes ago
  I think the accomplishment of difficult real-world tasks requires that it does so. I hope that we're able to reach a level of introspection to produce a satisfactory answer (and avoid doomsday), but I think that requires a more educated question. The premise of consciousness as we understand it now could be misleading.
  In the same way that studying alien life would reveal more about how life in general canonically forms and exists, studying this artificial intelligence could unlock a new understanding of our own minds.
zug_zug 58 minutes ago
I'm sorry what even is this? Giving $10k rewards for significant advancements toward "AGI"?
What does "making a framework" even mean, it feels like a nothing post.
When I think of what real AGI would be I think:
- Passes the Turing test
- Writes a New York Times Bestseller without revealing it was written by AI
- Writes journal articles that pass peer review
- Wins a Nobel Prize
- Writes a successful comedy routine
- Creates a new invention
And no, nobody is going to make an automated Kaggle benchmark to verify these. Which is fine, because an LLM will never be AGI. An LLM can't even learn mid-conversation.
- voxleone 8 minutes ago
  >> An LLM can't even learn mid-conversation.
  There’s an implicit assumption that scaling text models alone gets us to human-like intelligence, but that seems unlikely without grounding in multiple sensory domains and a unified world model.
  What’s interesting is that if we do go down that route successfully, we may get systems with something like internal experience or agency. At that point, the ethical frame changes quite a bit.
- stingraycharles 41 minutes ago
  I get the feeling that the original post was also written using LLMs, it doesn’t make a lot of sense.
  If an LLM like this is really intelligent, at the very least, I’d expect it to be able to invent.
  For example, train an LLM on a dataset only containing knowledge from before nuclear energy was invented, and see if it can invent nuclear energy.
  But that’s the problem: they’re not really training the model on intelligence, they’re training it on knowledge. So if you strip away the knowledge, you’re left with almost nothing.
- ixtli 55 minutes ago
  They’re slowly redefining AGI so they can use it for more marketing. If you showed someone from 1960 our LLMs and told them “this is AI” I think they’d be astounded but a little confused because “artificial intelligence” definitely carried a very clear meaning in literature and media. Now it is marketing terminology and we’re no closer to having a meaningful definition for the word intelligence.
  - paganel 5 minutes ago
    > They’re slowly redefining AGI so they can use it for more marketing.
    If they don't do that then those trillions of dollars that support their current share price will most probably evaporate, so there are very big incentives for them to just outright try and re-create reality (like what we usually meant when we were thinking about artificial intelligence).
- ahoka 44 minutes ago
  I find it very interesting about the Turing test that as chatbots improve, so do humans get better at recognizing them.
- sourcegrift 55 minutes ago
  Grok recently created a cancer vaccine for a dog that reduced tumor size by 75%
  - 10xDev 52 minutes ago
    Severely misleading statement.
boca_honey 26 minutes ago
Friendly reminder:
Scaling LLMs will not lead to AGI.
- beeflet 24 minutes ago
  Who attuned your crystal ball?
  LLMs are already pretty general. They've got the multimodal ones, and aren't they using some sort of language-action-model to drive cars now? Who is to say AGI doesn't already exist?
  - airstrike 21 minutes ago
    It doesn't already exist, pretty obviously.
    https://www.youtube.com/watch?v=YeRS4TbtZWA
  - nutjob2 19 minutes ago
    It's a trick statement, because AGI is undefined.
    - beeflet 11 minutes ago
      I think LLMs are at least name-worthy given that they're artificial and somewhat smart in a generality of domains. Albeit the "smartness" comes from training on a massive corpus of text in those domains. So maybe it's really a specific intelligence, but for so many specific tasks that it seems general.
      At some point you have to throw in the towel when these things are going to be walking and talking around us. Some people move the goalposts of "AGI" to mean that the machine totally emulates a person, including curiosity and creativity, which these models currently lack.
      But why should it? In Genesis, it's said that god created man after its own image. I have to assume this implies we inherit god's mental attributes (curiosity, creativity, etc.) rather than its physical attributes.