Very often, I get asked the defiant "Gabe, why do you think you can do better than this genius/company/institution/NGO/research lab? Don't you know that markets are efficient?"
Much more rarely (but not never!), I get asked a more flattering "Gabe, how come you managed to do better than this genius/company/institution/NGO/research lab?"
I would love for my answer to be "It's because I'm simply so much smarter and more gifted by the Gods!" And sometimes, I do give that answer jokingly.
But the true answer is more prosaic. It is that focusing on complex goals is hard. And empirically, both individuals and groups usually fall prey to one of four common distractions.
My goal here is not to give advice. I just want to draft a linkable explanation for each of these four dynamics.
1. Compulsions
A compulsion is a drive, an urge to act a certain way. Pleasure usually creates a small compulsion, but indulging in a compulsion is not necessarily pleasurable.
Consider: competitions and debates we cannot stop engaging in, scrolling social media, watching one more episode of a Netflix show, eating more than we savour or need, drugs, OCD, etc.
They're all compulsions. And many people and groups are driven by their compulsions, more than by their goals.
Hardcore activists love violence, regardless of how much it helps with their political goals.
Nerds love their puzzles, regardless of whether they are beneficial.
Researchers have a type of research that they like to do, regardless of whether it is an important problem to work on.
Addicts are addicted, regardless of the impact on their lives.
In AI Safety, I have met multiple researchers who claim they want to help prevent extinction risks from superintelligence. And yet, they work on mechanistic interpretability, without a clear explanation for why that's the best they can do.
When asked to explain their reasoning, the answer boils down to: they wanted to do research on LLMs, and it seemed like a fun and appealing type of research.
A similar excuse is that it is their "comparative advantage". In this context, comparative advantage is just pseudo-scientific jargon that means "I specialized in this, and don't intend to change".
It is quite easy to be more successful at a complex goal than people who just want to have fun, do what's pleasurable, follow their drives, and refuse to learn anything outside of their zone of focus.
2. Traumas
Traumas are the opposite of compulsions. While compulsions drive someone to do or get something, traumas drive them to avoid it.
Both individuals and groups can have trauma.
Taboos are a core example of group traumas. Even though individuals can easily think about the taboo topic in private, the group cannot talk about it publicly, nor act on it.
A group can also have a trauma by virtue of all of its members sharing it. For instance, in AI Safety, there are many nerds who are good at technical topics and who hate the idea of talking to or engaging with politicians. Politicians have more authority than they do, and may disagree with them in a way that they do not respect! Those are two things that nerds hate.
As a result, nerds systematically underestimate any plan that involves sustained interaction with politicians, or only consider engaging with them in Machiavellian ways, rather than treating them like fellow human beings.
It is quite easy to be more successful at a complex goal than people who just reject the obvious solutions and the efficient plans. Scrub mentality.
3. Trolls
Coming up with a precise explanation for what I mean by "a troll" is hard, but let's try.
Complex questions and problems usually have countless plausible answers. Someone trying to solve them will struggle quite a lot. A troll is an answer to such a problem that…
- comes from the external world (a friend, a podcaster, some book);
- was selected and optimized on grounds other than being the best answer;
- and yet still seems plausible.
Trolls are extremely pernicious. Because they are plausible, there is no reason for the person to be suspicious of them a priori.
However, complex questions and problems are… complex. Coming up with a good answer is already hard. But when we're trying to do something else at the same time, like not triggering a trauma or satisfying someone's interests, it's just impossible.
This is why trolls are bad. When a plausible answer is brought to someone's attention, their immediate reaction is to consider it. After all, if an argument is strong enough, it shouldn't matter who says it. But it does matter, because we do not have infinite attention.
This is how concern trolling, Gish gallops, FUD and many other tactics work. They bring plausible answers, arguments, and solutions to our attention precisely to consume it.
In AI, there's a lot of safety washing. People who want to build AGI as fast as possible will come up with many plausible-sounding arguments for why racing for AGI is good, or for why pausing and slowing down is bad.
Many of them work at AGI companies. They can be CEOs, like Dario Amodei or Demis Hassabis; employees, like Jack Clark and Holden Karnofsky; or friends, like Paul Christiano.
Evaluations, Responsible Scaling Policies, and Voluntary Commitments are all trolls aimed at not pausing AI right now.
And the biggest of them all is the idea that working on the state of the art is useful for AI Safety. In OpenAI's Planning for AGI and beyond, the explanation is "we can incrementally publish things, which is safer than publishing things in bulk". In Anthropic's Core Views on AI Safety, it is that effective empirical research cannot be done on small models. I don't think DeepMind has a comparable doc, but Demis' opinion is that working on the state of the art is important to stay in control of the development of AI.
It is possible to deconstruct these trolls with some effort. But the core thing to notice is that you shouldn't need to. Sam, Dario, and Demis will all come up with arguments to justify why their orgs are racing.
For instance, Anthropic has been fundraising based on recursive self-improvement and winning the race for a while.[1] Whatever argument Anthropic comes up with has to be compatible with this fundraising strategy. If they come up with 10 arguments, it doesn't mean that they are 10x more right, it means that they searched for them for 10 times as long.
Similarly, OpenAI talks about how incremental deployment is good for safety. But at the same time, it has been raising for and investing in as much compute infrastructure as possible, racing in a way that makes deployment much less incremental and much more abrupt.
By the time we try to deconstruct a troll, we have already given it attention; it has already spread and served its purpose.
For instance, when an employee of such a company puts out a statement called "RSPs are actually the right way to pause AI", I recommend discarding it, not arguing against it.
We have finite attention, and I strongly recommend not spending it on the latest PR iteration of companies that are raising rounds and racing to become the ones who will build AGI.
Else, you end up with tweets like this: "Consider! Would it be good for safety to actually push the race forward?"
Dan Hendrycks is not a nobody. He is the director of the Center for AI Safety, which pushed for SB 1047 and is behind the landmark open letter on extinction risks from AI.
Yet, he focuses on convoluted arguments, gets trolled, and further trolls others. This behavior is pervasive in "AI Safety". By quoting his tweet, I am not singling him out. This happens in private all the time, everywhere, at all levels. This is just an extremely telling and simple example that happens to be public.
Dealing with compulsions and traumas can be quite hard, especially when one is too smart for their own good.
But dealing with trolls is orders of magnitude harder. Trolls are created, evolved, and iterated upon with the collective effort of entire communities. Thousands to millions of people will constantly generate plausible-sounding arguments for their side, and the best ones get selected.
When someone is part of a group, there's also an expectation of assuming good faith toward fellow group members, and of giving them the time of day. Because of this expectation of attention and credence, trolls coming from groups we belong to are especially potent.
I don't really have simple advice here beyond "It sucks and be aware of it."
On that note, let's move on to the most implacable phenomenon: entropy.
4. Entropy
Complex questions and problems require sustained effort over a long period of time.
But it's hard. We hit stark diminishing returns, the endeavor stops being rewarding, we get bored and give up. We fall prey to a routine or inertia and fail to reorient once our course of action stops being the best one. Other priorities compete for our time, attention, and resources. We get made fun of whenever we act a certain way. Life happens, something changes, an accident occurs, and everything goes to shit.
In general, the universe moves forward, and never in the direction we want it to. So we must put some effort into steering it in the direction we care about. And because we do not know of any method to keep the universe moving in that direction on its own, we must constantly put effort into our long-term endeavors. Else, they just decay.
In short: Everything needs maintenance. And entropy is the reason why.
Conclusion
Again, the goal of this article is not really to share advice.
I will likely write at some point about the principles I use to be more resilient to these distractions, but unfortunately, they are not trivial.
In the meantime, my main piece of advice would be "Pay attention to this. This applies to everyone. Including you. Including me."
[1] Anthropic's $5 billion, 4-year plan to take on OpenAI has been specifically predicated on raising billions to win the race against OpenAI. To do so, it planned to use a "next-gen algorithm for AI self-teaching". "These models could begin to automate large portions of the economy," the pitch deck reads. "We believe that companies that train the best 2025/26 models will be too far ahead for anyone to catch up in subsequent cycles."