At 175 billion parameters, it’s the largest of its kind. And with a memory size exceeding 350GB, it’s one of the priciest, costing an estimated $12 million to train. Fortunately for competitors, experts believe that while GPT-3 and similarly large systems are impressive with respect to their performance, they don’t move the ball forward on the research side of the equation. Rather, they’re prestige projects that simply demonstrate the scalability of existing techniques.
“I think the best analogy is with some oil-rich country being able to build a very tall skyscraper,” Guy Van den Broeck, an assistant professor of computer science at UCLA, told VentureBeat via email. “Sure, a lot of money and engineering effort goes into building these things. And you do get the ‘state of the art’ in building tall buildings. But … there is no scientific advancement per se. Nobody worries that the U.S. is losing its competitiveness in building large buildings because someone else is willing to throw more money at the problem. … I’m sure academics and other companies will be happy to use these large language models in downstream tasks, but I don’t think they fundamentally change progress in AI.”