Yesterday’s release of Meta’s LLaMA 2, under a commercial license, was undoubtedly an open source AI mic drop. But startup Together, known for creating the RedPajama dataset in April, which replicated the LLaMA dataset, had its own big news over the past couple of days: It has released a new full-stack platform and cloud service for developers at startups and enterprises to build open source AI — which, in turn, serves as a challenge to OpenAI when it comes to targeting developers.
The company, which already supports more than 50 of the top open-source AI models, will also support LLaMA 2.
Founded last year by Vipul Ved Prakash, Ce Zhang, Chris Ré and Percy Liang, Together says it is “on a mission to make AI models more open and accessible in a market where Big Tech players are currently leading innovation.” The Menlo Park, CA-based startup announced in May that it had raised $20 million in a seed funding round to build open-source generative AI and a cloud platform.
“There is a clear debate between open source and closed systems, and now there is an open source ecosystem that is getting stronger,” Prakash told VentureBeat, explaining that the company is increasingly seeing enterprises move toward open source because of a desire for data privacy. And now, “there’s more adoption of open source models because open source models are getting stronger.”
New API and compute cloud services for leading open source AI models
Last Friday, the company launched the Together API and Together Compute, cloud services to train, fine-tune, and run the world’s leading open-source AI models. Together API is powered by “an efficient distributed training system to fine-tune large AI models, offering optimized, private API endpoints for low-latency inference.” For AI/ML research groups who want to pre-train models on their own datasets, Together Compute offers clusters of high-end GPUs paired with Together’s distributed training stack.
The result is far greater cost efficiency, said Prakash. “It’s $4 an hour for an A100 GPU on AWS. We have created a technology where we can host instances of a model for a user. For example, hosting a RedPajama 7 billion parameter model on an A100 on our platform is 12 cents an hour.”
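To put the two quoted prices side by side, here is a minimal Python sketch. It takes Prakash's figures at face value and does not verify them. The 30-day month of continuous use and the helper name `monthly_cost` are assumptions made for illustration only. Note also that the two numbers are not strictly like for like: one is the rental price of a whole A100, the other is the price of hosting a single model instance.

```python
# Prices as quoted in the article; treated as given, not independently verified.
AWS_A100_USD_PER_HOUR = 4.00
TOGETHER_HOSTED_7B_USD_PER_HOUR = 0.12

# Assumption for illustration: a 30-day month of continuous, 24-hour use.
HOURS_PER_MONTH = 24 * 30


def monthly_cost(usd_per_hour: float, hours: float = HOURS_PER_MONTH) -> float:
    """Return the cost of running continuously for the given number of hours."""
    return usd_per_hour * hours


# How many times more expensive the quoted AWS price is per hour.
price_ratio = AWS_A100_USD_PER_HOUR / TOGETHER_HOSTED_7B_USD_PER_HOUR

if __name__ == "__main__":
    print(f"Hourly price ratio: {price_ratio:.1f}x")
    print(f"AWS A100, one month: ${monthly_cost(AWS_A100_USD_PER_HOUR):,.2f}")
    print(f"Together hosted 7B, one month: ${monthly_cost(TOGETHER_HOSTED_7B_USD_PER_HOUR):,.2f}")
```

On the quoted figures, the hourly gap works out to roughly 33 times.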
Together can do that, he explained, because of something the Wall Street Journal reported on last month: A huge supply of used GPU chips left in the wake of changes in cryptocurrency mining.
“Tens of millions of GPUs became available after the Ethereum network—home to the second-biggest crypto, behind bitcoin—removed the need for these chips by ending the practice of mining for new coins and the intensive computation it required,” the article said, and about 20% of those chips can be repurposed to train AI models. Together has leased thousands of these GPUs to help power its new cloud services.
There will be parallel closed and open ecosystems
While Together is certainly challenging OpenAI and other closed, proprietary model companies, particularly in the enterprise space, Prakash said he believes there will be parallel closed and open ecosystems.
“My personal feeling is that the closed model companies will eventually get more app centric,” he said, pointing to Character AI and its efforts in consumer-focused chatbots. “They do that really well and their modeling efforts are sort of getting more and more focused in that direction.”
Similar to other fields from operating systems to databases, open source AI will be a more broadly applicable set of technologies, he explained. “I do think it will become difficult for closed models to charge a premium given that there are open solutions that exist and are now good for many problems.”
New chief scientist and Snorkel AI partnership
In addition to the platform news, Together announced this week that it has hired a new chief scientist, Tri Dao, who recently earned a Ph.D. in computer science at Stanford and is an incoming assistant professor at Princeton University. Dao is best known for his FlashAttention research, which improves training and inference of LLMs and is now widely used across Transformer-based models. FlashAttention-2, now available, speeds up training and fine-tuning of LLMs by up to 4x and achieves 72% model FLOPs utilization for training on NVIDIA A100s.
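For readers unfamiliar with the metric, model FLOPs utilization (MFU) is the ratio of the floating-point throughput a training run actually achieves to the hardware's theoretical peak. The Python sketch below shows what the 72% figure would imply on an A100. It assumes a peak of 312 TFLOPs/s, NVIDIA's published dense FP16/BF16 tensor-core figure for the A100. The function names are hypothetical and are not taken from FlashAttention or any Together library.

```python
# Assumption: NVIDIA's published dense FP16/BF16 tensor-core peak for the A100.
A100_PEAK_TFLOPS = 312.0


def model_flops_utilization(achieved_tflops: float,
                            peak_tflops: float = A100_PEAK_TFLOPS) -> float:
    """MFU: achieved model throughput divided by the hardware's peak throughput."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return achieved_tflops / peak_tflops


def achieved_tflops_from_mfu(mfu: float,
                             peak_tflops: float = A100_PEAK_TFLOPS) -> float:
    """Invert the ratio: the throughput implied by a given MFU."""
    return mfu * peak_tflops


if __name__ == "__main__":
    implied = achieved_tflops_from_mfu(0.72)
    print(f"72% MFU on an A100 implies about {implied:.0f} TFLOPs/s")
```

Under that assumed peak, 72% utilization corresponds to roughly 225 TFLOPs/s of useful model computation per GPU.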
Together also announced a partnership this week with Snorkel AI to enable organizations to build custom LLMs on their own data in their own secure environments. The end-to-end AI development solution spans data development, model training, fine-tuning, and deployment.