Large language models' surprise emergent behavior written off as 'a mirage'
Forget those huge hyped-up systems; that smaller one might be right for you. And here's why
Thomas Claburn
Tue 16 May 2023 // 11:38 UTC
ANALYSIS GPT-3, PaLM, LaMDA and other next-gen language models have been known to exhibit unexpected "emergent" abilities as they increase in size. However, some Stanford scholars argue that's a consequence of mismeasurement rather than miraculous competence.
As defined in academic studies, "emergent" abilities refer to "abilities that are not present in smaller-scale models, but which are present in large-scale models," as one such paper puts it. In other words, immaculate injection: increasing the size of a model infuses it with some amazing ability not previously present. A miracle, it would seem, and only a few steps removed from "it's alive!"
The idea that some capability just suddenly appears in a model at a certain scale feeds concerns people have about the opaque nature of machine-learning models and fears about losing control to software. Well, those emergent abilities in AI models are a load of rubbish, say computer scientists at Stanford.
Flouting Betteridge's Law of Headlines, Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo answer the question posed by their paper, Are Emergent Abilities of Large Language Models a Mirage?, in the affirmative.
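The core of their argument can be sketched in a few lines of Python. This is a toy simulation, not the researchers' code, and the numbers (model sizes, answer length, improvement curve) are made up for illustration: assume each token's accuracy improves smoothly as models get bigger, then score the same model two ways.

```python
import numpy as np

# Toy illustration of the mismeasurement argument (not the paper's code).
# Assumption: per-token accuracy improves smoothly with parameter count.
scales = np.logspace(7, 11, 9)                       # hypothetical model sizes
per_token_acc = 1 - 0.5 * (scales / 1e7) ** -0.3     # smooth, gradual improvement

# A nonlinear, all-or-nothing metric: the answer only counts if every one
# of its 30 tokens is correct.
answer_len = 30
exact_match = per_token_acc ** answer_len

for n, tok, em in zip(scales, per_token_acc, exact_match):
    print(f"params={n:9.1e}  per-token acc={tok:.3f}  exact match={em:.4f}")

# Per-token accuracy climbs steadily, but exact match sits near zero until
# the largest scales and then rises sharply -- an "emergent"-looking jump
# produced entirely by the choice of metric.
```

Score the same smooth curve with a linear or continuous metric instead of exact match, the authors argue, and the apparent discontinuity disappears.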
https://www.theregister.com/2023/05/16/large_language_models_behavior/