OctoML introduces new compute service to unlock generative AI innovation

OctoML today announced OctoAI, the industry's first self-optimizing compute service for AI. The new platform offers developers a fully managed cloud infrastructure designed to abstract away the complexity of building and scaling AI applications. OctoAI gives developers the freedom to run, tune, and scale the models they choose, including off-the-shelf open-source software (OSS) models and custom models. With OctoAI, developers now have easy access to cost-efficient, scalable accelerated computing, so they can focus on building high-performance cloud-based AI applications and delivering great user experiences for their customers.

To help developers quickly build on the latest and greatest models, OctoAI is also introducing a library of the world's fastest and most affordable generative AI models—powered by the platform's model acceleration capabilities. OSS foundation model templates available at launch include Stable Diffusion 2.1, Dolly v2, Llama 65B, Whisper, FlanUL, and Vicuna.
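For illustration, the sketch below shows how a developer might invoke one of these hosted model templates over HTTP. The endpoint URL, request fields, response shape, and API token are hypothetical placeholders for this example, not OctoAI's documented API.

```python
# Hypothetical example: calling a hosted Stable Diffusion template over HTTP.
# The endpoint URL, request schema, and response format below are assumptions
# for illustration only; consult the service's documentation for the real API.
import base64
import requests

ENDPOINT = "https://example.invalid/v1/stable-diffusion/generate"  # placeholder URL
API_TOKEN = "YOUR_API_TOKEN"  # placeholder credential

payload = {
    "prompt": "a watercolor painting of a lighthouse at dawn",
    "num_images": 1,
}

response = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=60,
)
response.raise_for_status()

# Assume the service returns base64-encoded images in a JSON body.
for i, image_b64 in enumerate(response.json().get("images", [])):
    with open(f"output_{i}.png", "wb") as f:
        f.write(base64.b64decode(image_b64))
```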

"AI is no longer a novelty, it's real business. But efficient compute is critical to making it viable," said Luis Ceze, CEO, OctoML. "Every company is scrambling to build AI-powered solutions, yet the process of taking a model from development to production is incredibly complex and often requires costly, specialized talent and infrastructure. OctoAI makes models work for businesses, not the other way around. We abstract away all the complexity so developers can focus on building great applications, instead of worrying about managing infrastructure."
