AI pushes data processing into parallel universe

  • WEKA is a data storage company that provides enterprises with parallel file system access for AI training

  • The company said its technique helps improve utilization of costly GPU resources

  • Gartner’s Chandra Mukhyala told us competition in this arena will soon heat up

You may not have heard of it, but WEKA is one of those workhorse companies working behind the scenes to help power the artificial intelligence (AI) revolution. What it does is not necessarily new, but it is very much needed right now, and the company is one of only a handful currently offering these tools.

WEKA is a data storage company. But what currently makes it special is the shared parallel file system it offers.

Unlike conventional file systems, which typically route access to a given file through a single server, parallel file systems distribute file data across multiple servers so that many clients can read and write many files, or parts of the same file, simultaneously.
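To make the idea concrete, here is a minimal Python sketch of striping: a file's bytes are cut into fixed-size pieces, placed round-robin across several servers, and then fetched from all servers at once. This is an illustration only, not WEKA's design. The stripe size, the server count, the function names and the in-memory dictionaries standing in for storage servers are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe; hypothetical, real systems use far larger units
NUM_SERVERS = 3  # hypothetical number of storage servers


def stripe(data: bytes, num_servers: int = NUM_SERVERS, stripe_size: int = STRIPE_SIZE):
    """Split data into fixed-size stripes and place them round-robin on servers."""
    servers = [dict() for _ in range(num_servers)]
    for offset in range(0, len(data), stripe_size):
        index = offset // stripe_size
        # Each "server" is a dict mapping stripe index -> stripe bytes.
        servers[index % num_servers][index] = data[offset:offset + stripe_size]
    return servers


def read_striped(servers) -> bytes:
    """Fetch stripes from all servers concurrently, then reassemble in order."""
    def fetch(server):
        # Stand-in for a network read from one storage server.
        return list(server.items())

    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        chunks = [item for result in pool.map(fetch, servers) for item in result]
    # Sort by stripe index so the bytes come back in their original order.
    return b"".join(chunk for _, chunk in sorted(chunks))


servers = stripe(b"training-sample-0001")
restored = read_striped(servers)
```

Because no single server holds the whole file, reads can be spread across all of them instead of queuing behind one machine.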

Gartner Senior Director and Analyst for Storage and Data Management Chandra Mukhyala told Fierce that parallel file systems have been around for a while in the high-performance computing world. But with the rise of AI, there’s suddenly high demand for them in the enterprise arena as well.

Why? Well, it takes millions of files to train an AI model. So, you can imagine how valuable the ability to process multiple files at once would be. Even more so when you account for the fact that training is taking place on pricey GPUs, which might otherwise sit idle for long stretches of time waiting for one file after another to be opened and processed.

To put it into perspective, WEKA’s VP of Product Marketing Colin Gallagher told Fierce that one of its customers went from roughly 30% GPU utilization to 70% after switching to its system.
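As a rough back-of-the-envelope check on what that jump means, the short Python sketch below compares useful GPU time before and after. Only the 30% and 70% figures come from Gallagher; the function name and the 100-hour baseline are assumptions chosen for illustration.

```python
def useful_hours(paid_hours: float, utilization: float) -> float:
    """Hours of actual GPU work obtained from a block of paid GPU time."""
    return paid_hours * utilization


PAID_HOURS = 100.0  # hypothetical baseline of paid GPU time

before = useful_hours(PAID_HOURS, 0.30)  # utilization figure quoted for the old setup
after = useful_hours(PAID_HOURS, 0.70)   # utilization figure quoted after switching
gain = after / before                    # ratio of useful work for the same spend

print(f"Useful hours before: {before:.0f}, after: {after:.0f}, gain: {gain:.2f}x")
```

In other words, at those utilization levels the same GPU spend would deliver a little more than twice the useful work.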

Customers and competition

WEKA counts 23andMe, Stability AI, Applied Digital, Genomics England and NexGen Cloud among its customers. It also processed the data for U2’s show at the Sphere in Las Vegas. 

Word on the street is that the company is working with Tesla to help it train its autonomous driving AI (WEKA declined to confirm, but we have it on good authority). Then there are its customers in the media and entertainment space, like Parliament VFX, which uses its service to speed production and editing work.

In terms of competitors, Mukhyala said WEKA is one of many players in the distributed file systems game, but one of only three primary vendors offering parallel file systems. The other two are IBM and DDN. But Mukhyala added that this is likely to change in the near future.

“The specific advantage they have is a short term advantage because I expect every vendor in this market to add similar capabilities” in the coming years, he told Fierce. “It’s becoming part of standard Linux access protocols. When that becomes a standard, everybody’s going to have it so then they’re not going to have a unique advantage.”

The road ahead

According to Gallagher, WEKA initially launched as a cloud-only data platform on Amazon Web Services about six or seven years ago. It later brought the same code to on-premises deployments and expanded to other cloud providers including Microsoft’s Azure, Google Cloud and Oracle.

He noted one of the additional perks of using WEKA is that customers don’t need a secondary storage platform – for instance, Amazon’s S3 – in addition to its service. That said, it’s still an uphill battle convincing enterprises not to go with the native storage option included with the cloud of their choice.

Looking ahead, Gallagher said WEKA’s big goal is to expand its relationships with its newer hyperscale cloud partners and GPU cloud specialists, and to strengthen its sales channels. Additionally, it’s working on new features focused on data management at scale and on running its software on ARM-based processors, which could help lower costs.

He added that WEKA is also working to increase brand awareness, an effort he said has slowly started to pay off in recent months, with name recognition at Nvidia’s GTC and at analyst conferences.

The clock is ticking, though.

“I definitely expect them to have growth but they will have more competition going forward,” Mukhyala concluded.