Thursday, November 14, 2024

Untether AI Releases Early Access to imAIgine Software Development Kit Supporting speedAI Inference Acceleration Solutions


Untether AI®, the leader in energy-centric AI inference acceleration, announced the availability of early access (EA) of its imAIgine® Software Development Kit (SDK) supporting the speedAI® inference acceleration solutions.

The imAIgine SDK provides a push-button flow that streamlines the conversion of trained neural network models into optimized, inference-ready models for speedAI acceleration solutions. This latest EA release supports the speedAI family of devices and PCIe accelerator cards, which set a new industry benchmark for energy efficiency and deliver 2,000 TFLOPS of AI inference performance per device.

“Providing the early access version of the imAIgine SDK enables users to prepare their neural networks for the upcoming shipment of speedAI devices and cards,” said Philip Lewer, Sr. Director of Product at Untether AI. “With an extensive array of model garden and kernel support, automated compilation, and sophisticated analysis tools, this EA release gives users everything they need to easily deploy their models on the revolutionary speedAI family of inference acceleration solutions.”

Push-button flow for simple model deployment

The imAIgine SDK provides an automated path to running neural networks on Untether AI’s inference acceleration solutions, with push-button quantization, optimization, physical allocation, and multi-chip partitioning. The SDK supports both TensorFlow and PyTorch, and a few simple Python commands quantize, lower, physically allocate, and run the models on speedAI hardware in a matter of minutes. With a comprehensive model garden library and kernel support, users can quickly run classification, object detection, semantic segmentation, or natural language processing (NLP) models on speedAI hardware. Sophisticated, automated quantization techniques convert the neural network to the preferred datatype. To preserve accuracy after quantization, post-quantization training (PQT) and knowledge distillation algorithms are available. During compilation, the imAIgine SDK performs layer-fusion optimizations, graph lowering, kernel mapping, and physical allocation to produce an optimized implementation.
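For readers unfamiliar with the quantization step mentioned above, the following is a minimal, generic sketch of symmetric 8-bit quantization in plain Python. It does not use the imAIgine SDK and does not reflect its API or its actual algorithms; the function names `quantize_symmetric` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
# Generic illustration of symmetric integer quantization.
# NOT the imAIgine SDK API: all names here are hypothetical.

def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers using one shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

# The worst-case rounding error is about half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The gap between `weights` and `restored` is the kind of accuracy loss that retraining and knowledge distillation steps are generally intended to reduce.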

Power-user flow for low-level optimizations

With the power-user flow, users can directly develop optimized “bare metal” kernels for the over 1,400 RISC-V processors and over 350,000 at-memory compute processing elements in speedAI devices. Analogous to CUDA, but written in familiar C/C++, these kernels are directly compiled using a modified version of LLVM, enhanced to take advantage of the over 30 custom instructions Untether AI has added to the instruction set for its ultra-efficient at-memory compute architecture. Users can then manually place the kernels in any topology on the memory banks of the speedAI spatial architecture.

Extensive suite of analysis tools including virtual hardware

Within the imAIgine SDK there are several tools to analyze how networks are running on the speedAI devices, providing a virtual hardware view prior to receiving actual devices. The Model Explorer shows the entire floorplan of how the neural network is mapped to the silicon, enabling interactive inspection of connection topology, socket depth, and performance estimates. This can be enhanced by the Analysis Dashboard to provide information on processor activity, packet exchanges, and utilization. All of these tools provide a virtual hardware environment to help guide the user for optimal efficiency and performance.

Source: Businesswire
