OpenAI CEO Sam Altman speaks during a keynote address announcing ChatGPT’s integration with Bing at Microsoft in Redmond, Washington, on February 7, 2023.
Jason Redmond | AFP | Getty Images
Before OpenAI’s ChatGPT came along and captured the world’s attention with its ability to compose compelling prose, a small startup called Latitude captivated consumers with its AI Dungeon game, which let players use artificial intelligence to create fantastical stories based on their prompts.
But as AI Dungeon became more popular, Latitude CEO Nick Walton recalled, the cost of maintaining the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by the GPT language technology offered by the Microsoft-backed artificial intelligence research lab OpenAI. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
The predicament was compounded when Walton discovered that content marketers were using AI Dungeon to generate ad copy, a use for AI Dungeon his team never anticipated, but one that ended up adding to the company’s AI bill.
At its peak in 2021, Walton estimated, Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and on Amazon Web Services to keep up with the millions of user queries it had to process each day.
“We used to joke that we had human and AI staff, and we spent about the same amount on each,” Walton said. “We were spending hundreds of thousands of dollars a month on AI, and we’re not a big startup, so it was a very big expense.”
By the end of 2021, Latitude had switched from using OpenAI’s GPT software to cheaper but still capable language software offered by the startup AI21 Labs, Walton said, adding that the startup has also incorporated open-source and free language models into its service to lower the cost. Latitude’s generative AI bills have dropped to under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help reduce the cost.
Latitude’s expensive AI bills highlight an unpleasant truth behind the recent boom in generative AI technologies: the cost of developing and maintaining the software can be extraordinarily high, both for the firms that develop the underlying technologies, commonly referred to as large language models or foundation models, and for those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry as venture capitalists eye companies that could potentially be worth trillions, and big companies such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the margin for AI applications is permanently smaller than previous software-as-a-service margins, because of the high cost of computing, it could put a damper on the current boom.
The high cost of training and “inference” (actually running) large language models is a structural cost that differs from previous computing booms. Even once the software is built or trained, it still requires a huge amount of computing power to run large language models, because they perform billions of calculations every time they return a response to a prompt. By comparison, serving web apps or pages requires much less computation.
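To put rough numbers on those “billions of calculations,” a common rule of thumb holds that a transformer performs about two floating-point operations per model parameter for each token it generates. A minimal Python sketch under that assumption (the parameter count and reply length below are illustrative inputs, not figures from the article):

```python
# Rule of thumb: ~2 floating-point operations per parameter for each
# generated token (forward pass only; ignores batching and cache details).
PARAMS = 175e9            # parameter count of a GPT-3-class model
TOKENS_PER_REPLY = 500    # illustrative length of one chatbot answer

flops_per_reply = 2 * PARAMS * TOKENS_PER_REPLY
print(f"{flops_per_reply:.2e} FLOPs per reply")   # ~1.75e+14
```

Serving a static web page, by contrast, takes on the order of millions of operations, which is why inference at chatbot scale demands so much more hardware.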
These calculations also require specialized hardware. While traditional computer processors can run machine learning models, they’re slow. Most training and inference now takes place on graphics processing units, or GPUs, which were originally designed for 3D gaming but have become the standard for AI applications because they can perform many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its main data center chip costs $10,000. The scientists who build these models often joke that they “melt GPUs.”
Training models
Nvidia A100 processor
Nvidia
Analysts and technologists estimate that the critical process of training a large language model such as GPT-3 can cost more than $4 million. Training more advanced language models can cost upwards of “high single-digit millions,” said Rowan Curran, a Forrester analyst who specializes in AI and machine learning.
Meta’s largest LLaMA model released last month, for example, used 2,048 Nvidia A100 GPUs to train on 1.4 trillion tokens (750 words is about 1,000 tokens) and took about 21 days, the company said when it released the model.
That works out to about 1 million GPU-hours of training. With dedicated pricing from AWS, that would cost over $2.4 million. And at 65 billion parameters, it’s smaller than OpenAI’s current GPT models, such as GPT-3, which has 175 billion parameters.
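The arithmetic behind that estimate is simple enough to check. Here is a minimal sketch; the per-GPU-hour rate is an assumed figure chosen to be consistent with the roughly $2.4 million total, not a published AWS price:

```python
# Back-of-the-envelope training cost, using the figures reported above.
gpus = 2048                  # Nvidia A100s used to train LLaMA
days = 21                    # reported training duration

gpu_hours = gpus * days * 24        # ~1.03 million GPU-hours
rate_per_gpu_hour = 2.40            # assumed dedicated AWS rate, USD

cost = gpu_hours * rate_per_gpu_hour
print(f"{gpu_hours:,} GPU-hours -> ${cost:,.0f}")   # 1,032,192 -> ~$2.5M
```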
Clément Delangue, CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two and a half months and required access to a supercomputer that was “something like the equivalent of 500 GPUs.”
Organizations that build large language models must be cautious when they retrain the software, which helps the model improve its abilities, because it is so expensive, he said.
“It’s important to understand that these models are not trained all the time, like every day,” Delangue said, noting that this is why some models, such as ChatGPT, lack knowledge of recent events. ChatGPT’s knowledge ends in 2021, he said.
“We are currently training version two of Bloom, and it will cost no more than $10 million to retrain,” Delangue said. “So that’s the kind of thing we don’t want to do every week.”
Inference and who pays for it
Bing with chat
Jordan Novet | CNBC
To use a trained machine learning model to make predictions or generate text, engineers run the model in a process called “inference,” which can be far more expensive than training, because a popular product might need to run the model millions of times.
For a product as popular as ChatGPT, which investment firm UBS estimates reached 100 million monthly active users in January, Curran estimates it could have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate that Microsoft’s Bing AI chatbot, which runs on OpenAI’s ChatGPT model, would require at least $4 billion in infrastructure to serve responses to all Bing users.
In Latitude’s case, for example, while the startup didn’t have to pay to train the underlying OpenAI language model it was accessing, it had to account for inference costs that were something like “half a cent per call” on “a couple of million requests per day,” a Latitude spokesperson said.
“And I was relatively conservative,” Curran said of his calculations.
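Taken at face value, the Latitude spokesperson’s rough figures imply a sizable monthly inference bill. A quick sketch using those approximate numbers:

```python
# Rough monthly inference bill from the spokesperson's figures.
cost_per_call = 0.005        # "half a cent per call", USD
calls_per_day = 2_000_000    # "a couple of million requests per day"

daily = cost_per_call * calls_per_day    # $10,000 per day
monthly = daily * 30                     # ~$300,000 per month
print(f"${daily:,.0f}/day, ~${monthly:,.0f}/month")
```

That lands in the same ballpark as the roughly $200,000 monthly peak Walton described, which is about as much agreement as such loose figures allow.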
To plant the seeds of the current AI boom, venture capitalists and tech giants have been pouring billions of dollars into startups that specialize in generative AI technologies. Microsoft, for instance, invested roughly $10 billion in GPT maker OpenAI, according to media reports in January. Salesforce’s venture capital arm, Salesforce Ventures, recently debuted a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the venture capital firms Haystack and Lightspeed Venture Partners described it on Twitter: “Venture capital dollars have shifted from subsidizing your taxi ride and burrito delivery to LLMs and generative AI compute.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at startup conferences, this is what I tell them: don’t solely depend on OpenAI, ChatGPT, or any other large language model,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut access, you’re gone.”
Companies such as enterprise technology firm Conversica are exploring how they can leverage the technology through Microsoft’s Azure cloud service at the current discounted price.
While Conversica CEO Jim Kaskade declined to comment on how much the startup is paying, he acknowledged that the subsidized cost is welcome as the company explores how language models can be used effectively.
“If they were really trying to break even, they would be charging a lot more,” Kaskade said.
How it could change
It is not yet clear whether AI computing will stay expensive as the industry develops. Foundation model companies, semiconductor makers, and startups all see business opportunities in lowering the price of running AI software.
Nvidia, which has about 95% of the AI chip market, continues to develop more powerful versions designed specifically for machine learning, but improvements in the industry’s overall chip power have slowed in recent years.
However, Nvidia CEO Jensen Huang believes that in 10 years, artificial intelligence will be “a million times” more efficient due to improvements not only in chips, but also in software and other computer components.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said on an earnings call last month. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists, AI researchers on new models, across that entire span, we’ve made processing large language models a million times faster.”
Some startups have focused on the high cost of AI as a business opportunity.
“Nobody said, ‘You should build something that’s purpose-built for inference.’ What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more of the processing in the computer’s memory, as opposed to on a GPU.
“Today, people are using GPUs, NVIDIA GPUs, to do most of their inference. They buy the DGX systems that NVIDIA sells, which cost a ton of money. The problem with inference is that the workload can spike very rapidly, which is what happened with ChatGPT: it grew to a million users in five days. Your GPU capacity can’t keep up with that, because it wasn’t built for it. It was built for training, for accelerating graphics,” he said.
Delangue, the Hugging Face CEO, believes more companies would be better served focusing on smaller, specific models that are cheaper to train and run than the large language models that attract most of the attention.
Meanwhile, OpenAI announced last month that it’s lowering the cost for companies to access its GPT models. It now charges one-fifth of one cent for about 750 words of output.
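That works out to $0.002 per roughly 1,000 tokens. A minimal sketch of what the new pricing implies for a hypothetical workload (the completion length and request volume below are invented inputs for illustration, not figures from the article):

```python
PRICE_PER_BLOCK = 0.002    # one-fifth of one cent, USD
WORDS_PER_BLOCK = 750      # roughly 1,000 tokens

def monthly_cost(words_per_request: float,
                 requests_per_day: float,
                 days: int = 30) -> float:
    """Estimate a month's bill at the reduced GPT pricing."""
    blocks = words_per_request / WORDS_PER_BLOCK
    return blocks * PRICE_PER_BLOCK * requests_per_day * days

# Hypothetical workload: 200-word completions, 1 million requests a day.
print(f"${monthly_cost(200, 1_000_000):,.0f}/month")   # ~$16,000
```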
OpenAI’s lower prices caught the attention of AI Dungeon-maker Latitude.
“I think it’s fair to say that it’s definitely a huge change we’re excited to see happen in the industry, and we’re constantly evaluating how we can deliver the best experience to users,” a Latitude spokesperson said. “Latitude will continue to evaluate all AI models to make sure we have the best game out there.”
Watch: AI’s “iPhone moment” – separating ChatGPT’s hype from reality
