Core Disagreements in the AGI Discourse
Moving beyond stale debates to focus on core disagreements
Introduction
A chunk of the public AGI Discourse has mostly stalled.
This points to the debates being misfocused. Instead of discussing core disagreements where we could make progress, we argue over disagreements that are too deep in the weeds.
This erodes the good faith and goodwill between participants, as they tire of making very little progress. This dynamic should be avoided. I wrote more about it in this long-form tweet.
To that end, I will try to pinpoint some core disagreements worth discussing, along with some downstream disagreements that I find more futile. I expect that moving the debate closer to those core disagreements would lead to more progress.
Agreement: Intelligence Levels
To the extent that intelligence can be collapsed to a single dimension, after a little reflection we all agree on the following ordering:
1. Rocks. Zero intelligence.
2. Plants. Some amount of intelligence: they interact with their environment.
3. Insects. They explore and communicate.
4. Mammals. They have legible feelings and trick other mammals.
5. Average Joe. Joe is average, but definitely smarter than monkeys.
6. Einstein. Einstein is smarter than Joe and can come up with things that Joe never could.
7. Smartest Realistic AIs. The smartest AIs that we can realistically build. Einstein has existed, and we are steadily reproducing more and more of intelligence in machines; we would all be stunned if we could not build AIs more intelligent than Einstein.
8. Omniscience. The theoretical maximum of intelligence, but it is not physically possible.
Core Disagreement: Correct way to measure AI around human level
There is a disagreement on whether there is a correct way to measure AI around human level, and if so, what that standard is.
I do not think people reflect much on this; it is more of a vibe. I wrote about it in the post: There is no IQ for AI.
For instance, the Turing Test is not a good standard. It is a binary criterion, so it cannot distinguish between two AIs that both pass it or both fail it. It is not standardised either: there are too many set-ups for the Turing Test.
In some set-ups, the evaluators know they are testing an AI; in others, they do not. Some set-ups are non-interactive: the test is to distinguish AI-generated text from human-generated text rather than to interact directly with the participant.
My conclusion is that there is no such standard, no correct way to measure Artificial Intelligence around human level. If you think about it that way, most downstream disagreements dissolve and stop mattering.
Futile Disagreement: LLM Intelligence
We do not agree on which of these levels LLM-based systems sit at.
Some say they are between level 3 (insects) and level 4 (mammals): Yann LeCun is famous for stating that our AIs are not even "cat-level" or "dog-level".
Others say they are between level 5 (average Joe) and level 6 (Einstein), given that they pass the SAT and can program faster than most programmers.
Given the above core disagreement, it should be clear why discussing this is a waste of time. There is no standard by which one could say that AIs are cat-level, dog-level, average-Joe-level or Einstein-level.
Futile Disagreement: LLMs are AGI
We do not agree on whether current LLM-based systems constitute AGI.
Some say they are AGI, given their ability to perform well on a wide range of tasks, from coding and logical reasoning to creative writing and humour.
Others say they are not AGI, arguing that they lack agency, planning and reliability.
Given the above core disagreement about measuring AI capabilities around human level, it should be clear why this debate is largely unproductive.
There are no agreed-upon standards by which one could definitively say whether LLMs are or are not AGI.
The disagreement usually stems from differing explicit definitions and implicit expectations of AGI. I have rarely seen it come from substantive differences in understanding of LLM capabilities.
Core Disagreement: Realistic Superintelligence
We do not agree on how smart our smartest realistic AIs can get in the next 20 years beyond "somewhat smarter than Einstein".
Some expect intelligence to cap at around human level. From their point of view, realistic superintelligence = Super-Einstein, at best.
Others are confident that intelligence and AI can reach much greater heights: enough to extract all the energy from the Sun and rebuild Earth at the molecular level. From their point of view, which I share, superintelligence = godlike entities. Godlike to us, in the same sense that we humans of 2024 are godlike to monkeys and to our oldest human ancestors.
Many lie in between those two beliefs. They would not bet that intelligence caps at around human level, nor would they consider godlike AI the most likely outcome. They just do not have any specific reason to lean one way or the other.
Futile Disagreement: Catastrophic Risks from AI
Depending on the probabilities we assign to these different heights of superintelligence, we worry about very different risks.
People who expect a godlike array of capabilities from AI worry about catastrophic accidents, in which we accidentally build AIs that wipe out humanity. You can read more about my stance on this question in this article: Preventing Extinction from Superintelligence.
People who expect the ceiling of AI to be cohorts of Super-Einsteins semi-automating large chunks of research worry more about misuse. They envision domestic and international terrorists using AI to build neo-weapons unlike anything seen before. They envision authoritarians building AI-enabled global totalitarian governments.
Some people do not care much about future AI capabilities. They worry more about bias and fairness. They envision AI creeping into all of our society's decision-making processes. They envision black-box decisions made by AIs that cannot be appealed and for which no person takes responsibility.
When people disagree on the heights of AI, it is a waste of time to discuss things like "p(doom)" or "AGI/ASI timelines". The expectations around AGI and ASI, as well as their definitions, are much too different. Where one means "a system smart enough to completely replace all humans", the other hears "robots with integrated ChatGPT".
Until we discuss and achieve more agreement on the realistic heights of AI in the next 10-20 years, I think the best we can do is list the main AI intellectual stances and their proponents. The stances should be named descriptively, not stupidly like "e/accs" or "doomers".
I think of it more in terms of the "Godlike ASI hypothesis", the "Super-Einstein ASI hypothesis", or the "Human-capped AI hypothesis". These labels are less memetic and fun, and that's the point.
After mapping these beliefs and which experts subscribe to them, a non-expert could assemble their own portfolio of beliefs and build policies around it.
Futile Disagreement: Criticality Threshold
By criticality threshold, I mean something that has been given many names: the intelligence explosion in 1966, later the technological singularity, recursive self-improvement, or, more recently, the sharp left turn.
The core idea is that, in the foreseeable future, AIs will likely become intelligent enough to improve themselves (1) without humans in the loop and (2) faster and faster, in which case they would do so.
Some people do not expect AI to reach such a criticality threshold. But as you might guess, this is often downstream of just not believing that AIs will be very intelligent in the first place.
Conversely, the people who believe that AIs will be intelligent enough to reach a criticality threshold also expect AIs to become much more intelligent after reaching it.
Mapping this disagreement chain would look like this:

Claim: "AIs will become so intelligent that they can introspect, improve themselves faster than humans would improve them, and do so in a way that gets faster and faster over time."

Yes → AIs accelerate well beyond that point.
No → AIs stay below that point.
Core Disagreement: Alignment Auto-Scaling
This core disagreement is usually held between people who already agree that godlike AI is possible. Sometimes it is downstream of the disagreement on the criticality threshold, but I have seen people share the belief that criticality thresholds exist and yet disagree on this question.
Quoting myself (you can find my full opinion on this point here):
Weak alignment is aligning LLM systems with benchmarks understandable to humans.
Strong alignment is aligning superintelligence.
Difficulty is irrelevant.
What matters is whether weak alignment helps strong alignment.
Alignment Auto-Scaling is the thesis that weak alignment naturally translates to strong alignment.
This is a major disagreement in the safety community. Many people work on weak alignment specifically because it might help with strong alignment.
For-profit companies whose C-suites have discussed catastrophic risks from AI regularly invoke this argument. It was first used by DeepMind, then OpenAI, and finally Anthropic.
The main reasoning behind Alignment Auto-Scaling is "We will ask the weak AIs to align more and more powerful AIs until they reach superintelligence".
I believe this is wrong and will explain why in another post. In the meantime, before I make that argument, I want to outline the various disagreements downstream of it.
Futile Disagreement: AI Alignment people should be at the forefront of capabilities
People who care about extinction risks from AI and work at DeepMind, OpenAI and Anthropic often claim that AI Alignment people should be at the forefront of capabilities.
This is often invoked as a defence of doing capabilities work, even though working on capabilities reduces the time left to work on alignment before a criticality threshold is reached.
While this can be used as a coping mechanism for people who find AI fun but want to feel like they are doing alignment, one can also see it as downstream of the above core disagreement. If Alignment Auto-Scales by default, then one should work on weak alignment with the best tools possible.
Conversely, if alignment did not Auto-Scale, staying at the forefront of capabilities would be much less useful, as working on Weak Alignment on SOTA artefacts would not help much.
Futile Disagreement: Superintelligence Alignment is just an engineering problem
I believe this one is the most trivial of the downstream disagreements seen so far.
If Alignment Auto-Scales, one can simply focus on Weak Alignment, an engineering problem.
It might seem tautological, but I have already witnessed many discussions about whether alignment was "easy" or "hard" without hitting the disagreement around Alignment Auto-Scaling once. As a result, people mostly talked past each other.
Conclusion
I wanted to give an example of what it looks like to go for core disagreements rather than re-hash the same discussions that do not progress.
I would love it if people found better core disagreements than the ones I have identified.
What I would hate, though, is to see people re-hash the same downstream topics, talking past each other and eroding their mutual trust over time. This would be a waste.
Cheers, and have a nice day!