AIT365

IBM has a new way to train large language models for enterprise


IBM’s new synthetic data generation method and phased-training protocol allow enterprises to update their LLMs with task-specific knowledge and skills, taking some of the guesswork out of training generative AI models.

Modern chatbots have become astoundingly good at generating conversations in the voice of a pirate or summarizing reports in the style of an accountant. But that’s not always the case: they can still go off-topic or provide incorrect information.

Much of their uneven performance comes down to training data. For most bots, that’s raw text scraped from the internet, followed by task-specific information, generated by either humans or machines, that’s added during fine-tuning (or alignment).

The large language models (LLMs) behind modern chatbots are pre-trained on the raw text to learn an abstract representation of language. This then primes them to learn many tasks quickly once they see labeled, detailed instructions during alignment.

But quality instruction data is hard to come by. It’s laborious and expensive for humans to make, and typically lacks the depth and breadth that chatbots need to guide them through difficult, rare, or ambiguous situations. Synthetic data costs a lot less, but it often suffers from a similar lack of variety.


IBM has a new solution: Large-scale Alignment for chatBots, or LAB. It’s a method for systematically generating synthetic data for the tasks you want your chatbot to accomplish, and for assimilating new knowledge and capabilities into the foundation model — without overwriting what the model has already learned. With LAB, LLMs can be drastically improved in far less time and at a lower cost than is typically spent training LLMs.

“Instruction data is the lever for building a chatbot that behaves the way you want it to,” said Akash Srivastava, chief architect of LLM alignment at IBM Research. “Our method allows you to write a recipe for the problems you want your chatbot to solve and to generate instruction data to build that chatbot.”

IBM’s data-generation method is driven by a taxonomy that allows LLM developers to define the knowledge and skills they want to add to their chatbot. The taxonomy maps out the LLM’s existing knowledge and skills in a logical, hierarchical way, giving developers a framework to identify and fill in gaps with new knowledge and skills.

The taxonomy guides a second LLM, known as the teacher model, in generating high-quality instructions, formulated as pairs of questions and answers tailored to the task at hand. Let’s say you want a chatbot to be able to draft an email for a CEO summarizing their company’s third-quarter earnings. The task requires an understanding of financial statements, basic math and reasoning, and the ability to summarize financial information in an email that strikes the right tone.

IBM’s taxonomy works by segregating instruction data into three overarching categories: knowledge, foundational skills, and compositional skills that draw on knowledge and foundational skills.
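
To make the three-way split concrete, here is a minimal Python sketch of how such a taxonomy could be represented as a nested dictionary. Only the three top-level categories come from the description above; every branch and leaf name, along with the `leaf_paths` helper, is a hypothetical illustration and not IBM’s actual schema.

```python
# Hypothetical taxonomy. The three top-level branches follow the article;
# all nested names are illustrative assumptions, not IBM's real schema.
TAXONOMY = {
    "knowledge": {"finance": {"financial_statements": {}}},
    "foundational_skills": {"math": {"arithmetic": {}}, "reasoning": {}},
    "compositional_skills": {"writing": {"earnings_email": {}}},
}


def leaf_paths(tree, prefix=()):
    """Yield the path to every leaf, i.e. every node without children."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


# Each leaf is a place where a developer could attach seed examples
# or documents for the teacher model to work from.
LEAVES = ["/".join(path) for path in leaf_paths(TAXONOMY)]
for leaf in LEAVES:
    print(leaf)
```

A tree like this makes gaps visible: a branch with no leaves, or a leaf with no seed examples, marks a skill or knowledge area that still needs instruction data.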

Here, the data needed might include accounting knowledge, math skills, and a combination of writing and reasoning abilities for drafting a coherent email. The teacher model would generate instructions for each category while iteratively running quality control on its results.

In the first step of this hypothetical example, the LLM developer might upload the company’s financial statements and several examples of how to calculate corporate earnings. The teacher model would then generate instructions grounded in the financial documents. This way, if accounting rules change, new instructions can be made.

On a second path, the teacher model generates instructions that will enable the base LLM to calculate the earnings. On a third path, the developer uploads sample earnings-report emails, and the teacher model generates more instructions that will enable the base model to write the desired email.
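
The three paths above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not IBM’s implementation: the `Task` class, the prompt wording, and `stub_teacher` (a stand-in for a real call to a teacher LLM) are all hypothetical, as are the seed file descriptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A teacher maps a prompt to (question, answer) pairs. Hypothetical interface.
Teacher = Callable[[str], List[Tuple[str, str]]]


@dataclass
class Task:
    category: str  # "knowledge", "foundational_skills" or "compositional_skills"
    name: str
    seeds: List[str] = field(default_factory=list)  # uploaded documents or examples


def build_prompt(task: Task) -> str:
    """Build a teacher prompt, grounded in the seeds when any are given."""
    header = f"Write question-answer pairs for the {task.category} task '{task.name}'."
    if not task.seeds:
        return header
    grounding = "\n".join(f"- {seed}" for seed in task.seeds)
    return f"{header}\nGround them in:\n{grounding}"


def generate(tasks: List[Task], teacher: Teacher) -> List[Dict[str, str]]:
    """Run every task through the teacher and tag each pair with its origin."""
    data = []
    for task in tasks:
        for question, answer in teacher(build_prompt(task)):
            data.append({"category": task.category, "task": task.name,
                         "question": question, "answer": answer})
    return data


def stub_teacher(prompt: str) -> List[Tuple[str, str]]:
    # Stand-in for a real teacher LLM: returns one canned pair per prompt.
    return [(f"Question for: {prompt.splitlines()[0]}", "placeholder answer")]


TASKS = [
    Task("knowledge", "accounting", ["company financial statements"]),
    Task("foundational_skills", "earnings_calculation", ["worked earnings examples"]),
    Task("compositional_skills", "earnings_email", ["sample earnings-report emails"]),
]
DATA = generate(TASKS, stub_teacher)
```

Because each record keeps its category and task name, the grounding documents for one path can be swapped out (say, after an accounting rule change) and only that path regenerated.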

The teacher model also runs quality control checks on the data it generated. Acting as its own harshest critic, it discards irrelevant questions and instructions containing incorrect information.
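
A filtering step of this kind might look like the sketch below. It assumes a judge function that returns a relevance and a correctness verdict for each question-answer pair. In a real system the judge would be the teacher LLM itself; here `stub_judge`, its keyword rule, and the tiny `REFERENCE` table are toy assumptions used only to make the example runnable.

```python
from typing import Callable, Dict, List

Pair = Dict[str, str]
# A judge returns {"relevant": bool, "correct": bool}. Hypothetical interface.
Judge = Callable[[Pair], Dict[str, bool]]


def filter_instructions(pairs: List[Pair], judge: Judge) -> List[Pair]:
    """Keep only pairs the judge marks as both relevant and correct."""
    kept = []
    for pair in pairs:
        verdict = judge(pair)
        if verdict["relevant"] and verdict["correct"]:
            kept.append(pair)
    return kept


# Toy reference answers standing in for the teacher's own judgment.
REFERENCE = {"Revenue is 10 and costs are 4. What are the earnings?": "6"}


def stub_judge(pair: Pair) -> Dict[str, bool]:
    question, answer = pair["question"], pair["answer"]
    return {
        # Toy relevance rule: the question must mention earnings.
        "relevant": "earnings" in question.lower(),
        # Toy correctness rule: match the reference answer when one exists.
        "correct": REFERENCE.get(question, answer) == answer,
    }


CANDIDATES = [
    {"question": "Revenue is 10 and costs are 4. What are the earnings?", "answer": "6"},
    {"question": "Revenue is 10 and costs are 4. What are the earnings?", "answer": "14"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]
KEPT = filter_instructions(CANDIDATES, stub_judge)
```

Here the second candidate is dropped for an incorrect answer and the third for being off-topic, leaving only the first.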

SOURCE: Newswire
