Tuesday, September 17, 2024

Iterative’s New DataChain Enables Use of AI Models to Evaluate the Quality of Unstructured Data

Iterative, the company dedicated to streamlining the workflow of artificial intelligence (AI) engineers and creator of widely used open-source MLOps projects, announced the upcoming release of DataChain, a new open-source tool for processing and evaluating unstructured data.

According to McKinsey’s Global Survey on the state of AI, published in early 2024, only 15 percent of surveyed companies have seen a meaningful effect of generative AI (GenAI) on their business to date. A large part of the problem is that processing unstructured data at scale and evaluating the results has traditionally been cumbersome, a difficulty that stems from the missing link between structured data technologies and the newer AI workflows based in Python. While older analytical databases provided full control over data quality, unstructured multimodal data such as text and images has proved much harder to assess and improve at scale.

“The biggest challenge in adopting artificial intelligence in the enterprise today is the lack of practices and tools for data curation and generative AI evaluation that can ensure the quality of results,” said Dmitry Petrov, CEO of Iterative. “As the next step, we need AI models that can evaluate and improve AI models. So far this has only happened at the industry forefront – take a look at DeepMind’s AlphaGo training against itself, or OpenAI’s DALL-E3 curating its own dataset. Our goal is to change this.”

The proliferation of sophisticated AI foundation models opens the door to intelligent data curation and processing. However, the lack of simple ways to wrangle unstructured data with AI models, and to keep the output in easy-to-manage formats, keeps the technology barrier high. In practice, most AI engineers still write custom code to convert JSON model responses, adapt them to databases, and run models in parallel over data that does not fit in memory.
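To make the "custom code" concrete, here is a minimal sketch of the kind of glue an engineer might write by hand today: call a model in parallel, parse its JSON reply into a typed record, and load the records into a database. It uses only the Python standard library and is not DataChain's API. The `Rating` schema, the `fake_model` stand-in, and the `ratings` table are all hypothetical names chosen for illustration.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from dataclasses import astuple, dataclass


@dataclass
class Rating:
    """Hypothetical schema for one model judgment about one file."""
    file: str
    score: float
    reason: str


def fake_model(file: str) -> str:
    """Stand-in for a real model call (assumption); returns a JSON string."""
    return json.dumps({"score": len(file) % 5 / 4, "reason": "placeholder"})


def evaluate(file: str) -> Rating:
    """Convert the raw JSON response into a typed record."""
    payload = json.loads(fake_model(file))
    return Rating(file=file, score=float(payload["score"]), reason=str(payload["reason"]))


def run(files, db_path=":memory:"):
    """Evaluate files in parallel and store the results in SQLite."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        ratings = list(pool.map(evaluate, files))  # map preserves input order
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS ratings (file TEXT, score REAL, reason TEXT)")
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", [astuple(r) for r in ratings])
    conn.commit()
    return conn
```

Once the records are in a table, curation becomes an ordinary query, for example `run(files).execute("SELECT file FROM ratings WHERE score >= 0.5")`. Every step around that query (schema, parsing, parallelism, storage) is boilerplate that has to be rewritten for each model and dataset.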

DataChain democratizes popular AI-based analytical capabilities such as ‘large language models (LLMs) judging LLMs’ and multimodal GenAI evaluations, leveling the playing field for data curation and pre-processing. DataChain can also store and structure Python object responses using the latest data model schemas, such as those used by leading LLM and AI foundation model providers.

Source: GlobeNewsWire
