IPA, an artificial intelligence-driven biotherapeutic research and technology company, announced the development of a Foundation AI Model that represents a significant advancement in life sciences research and development. The Company’s model combines Large Language Models (LLMs), integrated through an advanced stacking technique, with BioStrand’s patented HYFT Technology. The ability of HYFTs to pinpoint unique ‘fingerprints’ in biological sequences enables the stacked LLMs to apply their vast knowledge base with greater specificity, leading to more accurate predictions and insights. This integration marks a pivotal moment in the utilization of artificial intelligence for complex biological data analysis and drug discovery.
Unveiling the Intricacies of HYFT Technology
Central to BioStrand’s Foundation AI Model is its patented HYFT technology, a framework designed to identify and leverage universal fingerprint™ patterns across the biosphere. These fingerprints act as anchor points that carry layers of information connecting sequence data to structural data, functional information, bibliographic insights, and beyond, linking otherwise disparate realms of knowledge. The core of BioStrand’s platform is a comprehensive and continuously expanding knowledge graph that maps 25 billion relationships across 660 million data objects. It links sequence, structural, and functional data from the entire biosphere to written text such as scientific literature, providing a holistic understanding of the relationships between genes, proteins, and biological pathways.
The seamless integration of HYFTs with stacked LLMs enables the BioStrand AI model to decode the complex language of proteins, unlocking insights crucial for antibody drug development and precision medicine.
Large Language Models (LLMs), originally developed for Natural Language Processing (NLP), can also be applied to “the language of proteins,” supporting tasks including, but not limited to, protein structure prediction, antibody binding optimization, and protein mutagenesis.
To understand ‘the language of proteins’, it is essential to detect meaningful words and word boundaries. This is where the HYFTs serve as critical enablers. By harnessing HYFT’s sophisticated computational capabilities, the previously abstract notion of identifying functional units or “words” in protein sequences is made tangible, allowing for precise mapping and analysis.
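To make the idea of “words” and word boundaries in a protein sequence concrete, the following is a minimal sketch in Python. It is not the HYFT algorithm, which the announcement does not describe. It assumes a small, made-up dictionary of motifs and a simple greedy longest-match scan; the names `MOTIFS` and `find_words` and the example sequence are hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical motif dictionary. These strings are invented for illustration
# and are NOT real HYFT patterns.
MOTIFS: Set[str] = {"GQG", "WYQQ", "CAR"}


def find_words(sequence: str, motifs: Set[str]) -> List[Tuple[int, int, str]]:
    """Scan left to right and return (start, end, motif) for each match.

    At every position the longest motif that matches is taken, and the scan
    then resumes after it. Residues that match no motif are skipped.
    """
    hits: List[Tuple[int, int, str]] = []
    max_len = max(len(m) for m in motifs)
    i = 0
    while i < len(sequence):
        match = None
        # Try the longest possible candidate first.
        for length in range(min(max_len, len(sequence) - i), 0, -1):
            candidate = sequence[i:i + length]
            if candidate in motifs:
                match = candidate
                break
        if match is not None:
            hits.append((i, i + len(match), match))
            i += len(match)
        else:
            i += 1
    return hits


if __name__ == "__main__":
    # Made-up amino acid string, used only to exercise the function.
    for start, end, word in find_words("AAWYQQKPGQGCAR", MOTIFS):
        print(start, end, word)
```

Each returned tuple marks where a “word” starts and ends, which is the kind of boundary information the paragraph above refers to. A production system would rely on curated patterns rather than a hand-written set.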
The Advanced Foundation AI model employs a distinctive approach known as “LLM stacking” to intelligently combine different LLMs, with the HYFTs linked to specific features found in various LLMs. In a natural language analogy, this is like using context to determine whether the word “apple” refers to a type of fruit or to Apple, the Silicon Valley pioneer. In a life sciences context, these features could include, for example, the identification of critical amino acid residues involved in protein binding or the detection of sequence variations associated with disease susceptibility. The sequence diversity harnessed by the HYFTs was discovered during the clustering of Next Generation Sequencing data sourced from IPA’s pipeline subsidiary, Talem Therapeutics, utilizing the HYFT network combined with LLM stacking. By incorporating the various features provided by LLM stacking in this study, it was possible to differentiate between binding and non-binding antibodies, even when they shared similar HYFT patterns.
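The announcement does not describe how LLM stacking is implemented. As a rough illustration of the general stacking idea only, the sketch below concatenates the outputs of several base models into one feature vector and passes it to a small meta-classifier. Everything here is an assumption made for illustration: `model_a` and `model_b` are toy stand-ins for language models, the nearest-centroid classifier is a deliberately simple choice, and the sequences and labels are invented rather than taken from the study.

```python
from typing import Callable, Dict, List

Vector = List[float]
Model = Callable[[str], Vector]


# Hypothetical stand-ins for base models. A real system would query protein
# language models; these toy functions only compute residue fractions.
def model_a(seq: str) -> Vector:
    """Toy feature: fraction of hydrophobic residues."""
    return [sum(seq.count(r) for r in "AVILMFWY") / len(seq)]


def model_b(seq: str) -> Vector:
    """Toy feature: fraction of charged residues."""
    return [sum(seq.count(r) for r in "DEKR") / len(seq)]


def stack_features(seq: str, models: List[Model]) -> Vector:
    """Concatenate the outputs of all base models into one vector."""
    features: Vector = []
    for model in models:
        features.extend(model(seq))
    return features


def centroid(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labeled: Dict[str, List[str]], models: List[Model]) -> Dict[str, Vector]:
    """Meta-classifier training: one centroid of stacked features per label."""
    return {
        label: centroid([stack_features(s, models) for s in seqs])
        for label, seqs in labeled.items()
    }


def predict(seq: str, centroids: Dict[str, Vector], models: List[Model]) -> str:
    """Assign the label whose centroid is closest in stacked-feature space."""
    feats = stack_features(seq, models)
    return min(centroids, key=lambda label: sq_dist(feats, centroids[label]))


if __name__ == "__main__":
    models = [model_a, model_b]
    # Invented training data; the labels carry no biological meaning.
    training = {
        "binder": ["AVILKK", "FWYAVD"],
        "non_binder": ["DEKRGS", "KRDEAG"],
    }
    centroids = fit(training, models)
    print(predict("AVLLDE", centroids, models))
    print(predict("DDKKGA", centroids, models))
```

The point of the sketch is that two sequences that look alike under one feature can still be separated once features from several models are combined, which parallels the binding versus non-binding distinction described above.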
Pioneering a New Frontier in Life Sciences
The concept of “word boundaries” within protein languages offers a groundbreaking approach to unlocking the complexities of protein structure and function, filling a void in the knowledge base of researchers and drug developers alike. By enabling precise identification and manipulation of functional units within proteins, this innovative methodology paves the way for advancements in drug discovery, protein-based therapeutics, and synthetic biology. It promises not only to accelerate the development of targeted treatments with higher efficacy and fewer side effects but also to revolutionize protein engineering and design. This approach, leveraging cutting-edge computational models and analysis techniques, stands to significantly reduce research and development timelines and costs.
SOURCE: BusinessWire