So AI Won’t Scale, Now What?
Rumours that the famed Transformer architecture powering GPT, Claude, and Llama won't keep getting better with more compute (the so-called scaling law) have been circulating for a while. Even recently, AI companies and pundits have denied this, hyping upcoming model releases and raising obscene amounts of money. Over the last few days, however, leaks and rumours have emerged suggesting that the big Silicon Valley AI companies are now waking up to the reality that the scaling law might have stopped working.
So what happens next if these rumours are true, and where does that leave the AI bubble? How likely, and how quickly, are we to break out of a plateau and get to more intelligent models?
Update: Overnight, Ilya Sutskever himself added his voice to the chorus of experts claiming that the scaling laws have hit a wall. This is a significant inflection point: Sutskever led much of OpenAI's work on GPT-3 and GPT-4 before leaving to start his own research company earlier this year. He is a big believer in AGI and superintelligence, so this admission carries a lot of weight.
A Brief History of LLMs and How We Got Here
To those unfamiliar with AI and NLP research, ChatGPT might appear to have been an overnight sensation out of left field. However, science isn't a "big bang"…