Stardog launches Voicebox, an LLM-powered layer to query enterprise data


Washington, DC area startup Stardog, a company that helps the U.S. Department of Defense and many other government agencies manage, query and reason with their structured and unstructured data, today announced an LLM-powered conversational layer aimed at simplifying access to business insights.

Officially dubbed Voicebox, the solution will be available as part of Stardog’s flagship platform, allowing users to ask questions using ordinary language and get answers based on enterprise data, without needing any technical skill.

The move marks the latest effort to loop in large language models to simplify how teams work with data, joining the likes of Kinetica, Databricks, Dremio and many other data ecosystem players.

“It’s hard to overstate this solution’s impact on competitiveness and profitability as universal access to relevant data has long been one of the biggest obstacles to getting work done,” Kendall Clark, cofounder and CEO of Stardog, said in a statement. “Self-serve analytics is no longer the exclusive preserve of technical folks who’re able to program.”



How does Voicebox work?

At the core, Voicebox works like any other natural language processing (NLP) bot, where the user just has to type their query to get an answer.

However, the AI layer does more than just pull from raw data on a customer’s platform: it interfaces with the Enterprise Knowledge Graph of Stardog.

The Enterprise Knowledge Graph connects to all data sources within a company, enriches information from those sources with relevant context, and creates human and machine-understandable knowledge — all designed to provide answers through Voicebox based on timely, trusted and accurate enterprise data.

“It all works by taking in natural language and using our models to turn human intent into structured graph queries that Stardog executes,” Clark told VentureBeat.
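The translation step Clark describes can be pictured with a toy example. The article does not say what query language or models Voicebox uses internally, so this is only a sketch under stated assumptions: it emits SPARQL (the standard query language for RDF graphs), it replaces the LLM with a hand-written pattern match, and every name in it (the `ex:` vocabulary, `question_to_sparql`) is hypothetical.

```python
import re

# Hypothetical vocabulary: maps words in a question to made-up graph terms.
PREFIX = "PREFIX ex: <http://example.org/>"
CLASSES = {"suppliers": "ex:Supplier", "customers": "ex:Customer"}
PROPERTIES = {"country": "ex:country", "industry": "ex:industry"}

# Toy stand-in for the model: only understands
# "which <class> have <property> <value>".
PATTERN = re.compile(r"which (\w+) have (\w+) (.+?)\??$", re.IGNORECASE)


def question_to_sparql(question: str) -> str:
    """Turn a question that fits the toy pattern into a SPARQL SELECT query."""
    match = PATTERN.match(question.strip())
    if match is None:
        raise ValueError("question does not fit the toy pattern")
    noun, prop, value = match.groups()
    noun, prop = noun.lower(), prop.lower()
    if noun not in CLASSES or prop not in PROPERTIES:
        raise ValueError("unknown class or property in question")
    # Escape backslashes and quotes so the value stays a valid string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"{PREFIX}\n"
        f"SELECT ?item WHERE {{\n"
        f"  ?item a {CLASSES[noun]} ;\n"
        f'        {PROPERTIES[prop]} "{escaped}" .\n'
        f"}}"
    )


if __name__ == "__main__":
    print(question_to_sparql("Which suppliers have country Germany?"))
```

In a real system the pattern match would be replaced by a model that has seen the graph's schema, and the resulting query would be sent to the graph database for execution rather than printed.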

The solution eliminates the task of writing queries for the knowledge graph platform, offering “data democratization” to all knowledge workers, and, according to the company, produces answers that are free from hallucinations.

Prompt engineering and agent techniques

Notably, Stardog also uses cutting-edge prompt engineering and agent techniques for summarizing schemas, doing data integrity checks and generally making user input safe, trusted, and contextually relevant for querying.

Clark further pointed out that “Stardog’s knowledge graph platform also includes additional services like entity resolution, hybrid AI inference, and federated graph streams and it’s this backend that Voicebox opens up for everyone, regardless of their technical skill.”

Currently, the company is using an ensemble of fine-tuned models based on two open-source projects and trained on data from a crowdsourced project, R&D and synthetic datasets.

It will also add a self-hosted LLM into the mix to offer more commercial flexibility as well as to create a more competitive offering for customers. The timeline for this, however, remains unclear at this stage.

Plans for Voicebox

While Clark did not share the names of enterprises using the new conversational AI layer, he did confirm that early access has been given to dozens of existing customers and new prospects, including those in manufacturing and pharmaceuticals.

“We’re talking to them regularly during the program to learn how they want to include LLM in their data projects, what benefits they’re seeing and what they’re looking for. Most of our customers and our growth programs focus on risk management and compliance in financial services; drug discovery and supply chain management in pharma; and Product360 and factory of the future in manufacturing,” he said.

For now, the company is working on introducing SMS and WhatsApp support for Voicebox, making sure that the question-answering abilities are fully integrated into the digital workflow of users. It is also looking at the possibility of introducing support for voice prompts, although it has not publicly shared a timeline for this.

Since its launch in 2015, Stardog has raised over $23 million in funding and roped in customers like Boehringer Ingelheim, Schneider Electric, NASA and the Department of Defense for its Enterprise Knowledge Graph. According to Forrester, the platform can provide an ROI of 320% and total benefits of over $9.86 million over three years.

LLMs helping with data challenges

Even though Stardog has the advantage of its knowledge graph, it is not the only one working to make data access easier with large language models.

In recent months, a number of enterprises have moved to simplify different aspects of data handling with generative AI. Kinetica launched a ChatGPT integration, followed by its own LLM, for querying data; Snowflake launched Document AI for unstructured data search; and Databricks debuted LakehouseIQ as a generative AI “knowledge engine” that allows anyone to search, understand and query internal corporate data by simply asking questions.

Informatica has also made a move in this space by launching Claire GPT to help users discover, interact with and manage their data assets via language prompts.
