Thursday, May 8, 2025

Fastino Launches TLMs with $17.5M Seed Round


Fastino AI unveiled a new AI model architecture it calls TLMs – Task-Specific Language Models. Developed by in-house AI researchers who previously worked at Google DeepMind, Stanford, Carnegie Mellon, and Apple Intelligence, TLMs deliver 99X faster inference than traditional LLMs and were trained for less than $100K on low-end gaming GPUs.

Fastino is also announcing $17.5M in seed funding led by Khosla Ventures, the first investor in OpenAI – bringing Fastino’s total funding to $25M. The round included participation from pre-seed lead investor Insight Partners as well as Valor Equity Partners, and notable angels including Scott Johnston, former CEO of Docker, and Lukas Biewald, CEO of Weights & Biases.

As of today, developers can access the TLM API, which includes a free tier with up to 10,000 requests per month. The API is purpose-built for specific tasks, with the first models including:

  • Summarization: Generate concise, accurate summaries from long-form or noisy text, enabling faster understanding and content distillation.
  • Function Calling: A hyper-efficient model designed for agentic systems, enabling precise, low-latency tool invocation – ideal for integrating LLMs into production workflows.
  • Text to JSON: Convert unstructured text into structured, clean, and production-ready JSON for seamless downstream integration.
  • PII Redaction: Redact sensitive or personally identifiable information on a zero-shot basis, including support for user-defined or industry-specific entity types.
  • Text Classification: A versatile zero-shot model for any labeling task, equipped with enterprise-grade safeguards including spam and toxicity detection, out-of-bounds filtering, jailbreak detection, and intent classification.
  • Profanity Censoring: Identify and redact profane language to ensure content compliance and brand safety.
  • Information Extraction: Extract structured data – such as entities, attributes, and contextual insights – from unstructured text to support use cases like document processing, search query parsing, question answering, and custom data detection.
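This announcement does not include Fastino’s API documentation, so the request schema below is purely an assumption for illustration. A minimal sketch of how a developer might package a call to a task-specific endpoint such as Text to JSON – where the endpoint URL, field names, and API key are all hypothetical:

```python
import json
import urllib.request

# Hypothetical endpoint and payload shape -- NOT from Fastino's docs,
# which are not part of this announcement.
TLM_ENDPOINT = "https://api.fastino.example/v1/text-to-json"

def build_tlm_request(text: str, api_key: str) -> urllib.request.Request:
    """Package unstructured text for an (assumed) Text-to-JSON task call."""
    body = json.dumps({"task": "text-to-json", "input": text}).encode("utf-8")
    return urllib.request.Request(
        TLM_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Example: turn a line of invoice text into a structured request.
req = build_tlm_request("Invoice #123 from Acme Corp, due May 30", "demo-key")
```

Because each TLM is scoped to one task, the request carries only the task name and input text; there is no prompt engineering or model-selection logic on the client side.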


“We started this company after our last startup went viral and our infrastructure costs went through the roof. At one point, we were spending more on language models than on our entire team. That made it clear: general-purpose LLMs are overkill for most tasks. So we set out to build models that worked for devs,” said Ash Lewis, CEO and co-founder of Fastino. “Our models are faster, more accurate, and cost a fraction to train while outperforming flagship models on specific tasks.”

Trained on NVIDIA gaming GPUs for less than $100,000, Fastino’s TLMs can run inference on low-end hardware such as CPUs or gaming GPUs. Although significantly smaller than current industry models with trillions of parameters, Fastino’s models deliver market-leading accuracy and inference 99.67X faster than existing LLMs. The specialized architecture achieves better accuracy as tasks become more well-defined.

“Large enterprises using frontier models typically only care about performance on a narrow set of tasks,” said Jon Chu, Partner at Khosla Ventures. “Fastino’s tech allows enterprises to create a model with better-than-frontier model performance for just the set of tasks you care about and package it into a small, lightweight model that’s portable enough to run on CPUs, all while being orders of magnitude faster with latency guarantees. These tradeoffs open up new use cases for generative models that historically haven’t been practical before.”

Fastino is breaking from industry standards with a flat monthly subscription that eliminates per-token fees, allowing developers to access the complete TLM suite with predictable usage costs. For enterprise customers, Fastino TLMs can be deployed within a customer’s Virtual Private Cloud (VPC), on-premise data center, or at the edge, allowing enterprises to maintain control over sensitive information while leveraging advanced AI capabilities.

“AI developers don’t need an LLM trained on trillions of irrelevant data points – they need the right model for their task,” said George Hurn-Maloney, COO and co-founder of Fastino. “That’s why we’re making highly accurate, lightweight models with the first-ever flat monthly pricing – and a free tier so devs can integrate the right model into their workflow without compromise.”

SOURCE: Businesswire
