Hammerspace ingests $56M for a new approach to work with vast amounts of unstructured data | TechCrunch



Data may be “the new oil,” but only when it can be extracted and put to use. Today, a five-year-old startup called Hammerspace, which aims to give any kind of data that lease of life, is announcing $56 million in funding, its first institutional investment, as it expands its business.

Prosperity7 Ventures — the venture arm of Saudi Aramco — is leading this first outside round, with ARK Invest, Pier 88 Hedge Fund, Samsung and other unnamed investors also participating.

Hammerspace was initially funded, co-founded and led by David Flynn, the pioneer technologist known for his early work on Linux, supercomputers and flash computing. And while it may not be a household name, it is already working with a number of very large companies and organisations that you will know, which have giant data needs.

Its customers include Jeff Bezos’ Blue Origin; the National Science Foundation; and Royal Caribbean Group. Major media groups are also using it to manage their data around special effects development (it’s been used for effects in Star Wars and Stranger Things, among other outsized productions). And at least one “super scaler,” in the words of Flynn, which he declined to name, is using Hammerspace to manage troves of unstructured data that it’s currently using to build and train large language models across tens of thousands of GPUs. (Note: I have a strong hunch about who this is, based on his response to a name I gave him and the other partners Hammerspace works with.)

“If you’re going to pay big bucks for GPU horsepower, the last thing you want is for that to sit idle, waiting for data to flow in and flow out of those systems,” Flynn said. “We provide radical input to feed data into and out of those training systems. It’s a data pipeline feeding into and out of those models at high speed and with the convenience of a real file system.”

Hammerspace is named after a concept from cartoons and comics, in which characters pull the objects they need out of thin air, and without getting too technical, that might also be the best way of explaining what the startup does. Essentially, it provides a way of making large amounts of data, regardless of where it lives or how it gets used, accessible and available to an organisation just when it needs it, and keeping it out of the way when it does not.

Flynn at first declined to describe the startup as in the area of data orchestration, or file management, or a pipeline, or data management — he is very personable and accessible, but also quick to be technical and thus very exact in his language — but frankly it does cover all of these to a degree.

Companies that need to use vast amounts of data in a project like building a new AI will typically find it a challenge to access and manage the data they need not only because of the sheer volume, but because it is unstructured, and also lives in many different places, across clouds, local servers, and more — hybrid architectures for very messy amounts of information.

Although companies like Snowflake have made a tidy business of managing structured data in such architectures for the purposes of business intelligence, the same does not apply to the unstructured datasets in which Hammerspace specializes, or to the kinds of applications this data is used for, which might include business intelligence but might also include AI processing.

Hammerspace’s technology breakthrough to cope with that is thanks in part to Flynn’s realisations early on, when he was working on flash storage at Fusion.io, about how this would pose a problem for businesses down the line. And in part, it’s thanks to the foundational work of his co-founder and CTO, Trond Myklebust, himself a bit of a legend in the world of computing, with his track record including being the maintainer and lead developer of the Linux kernel NFS client. The “file system” that Hammerspace has built for managing, moving, and orchestrating data is based on a particular implementation in Linux; and what it does, Flynn said, “is unique across the industry.”

The thinking is that unstructured data is potentially where the business opportunity lies for these applications and more. Hammerspace cites data from IDC that estimates more than 90% of business information “is likely to be formed of unstructured data by 2025.” And that’s in part why investors are interested.

“The information of our world is increasingly decentralized, and companies, now more than ever, need to access and move unstructured data out of silos and across platforms, making that data more useful and valuable,” said Cathie Wood, CEO of ARK Invest, in a statement. “Our mission is to capitalize on technological convergence across markets and industries, thereby changing the way the world works. Hammerspace aligns with that mission, unlocking new innovations across the enterprise.”

