Tuesday, November 5, 2024

CAMB.AI Introduces MARS5, World’s Most Capable Synthetic Speech Emulator


CAMB.AI, a pioneer in content localization, has released and open-sourced MARS5, which it describes as the world’s most capable synthetic speech emulator. Able to accurately replicate performances in over 140 languages, MARS5 requires just a few seconds of audio and text input to capture even highly prosodic performances.

MARS5’s English-language model is now open-source on GitHub. CAMB.AI encourages the developer, artificial intelligence, and research communities to build on and learn from the model. The other 140+ languages are available on CAMB.AI’s proprietary platform.

Co-founder and CTO Akshat Prakash explains, “The level of prosody and realism that MARS5 is able to capture, even with just a few seconds of input, is unprecedented. This is a mistral moment in speech.”

MARS5 combines a Mistral-style autoregressive model with a novel non-autoregressive model to capture emotion, performance, and meaning. According to the company, it performs well even in traditionally difficult scenarios such as sports commentary, movies, and anime, which existing closed-source and open-source TTS models do not capture well.


This breakthrough was achieved with the continued support of AWS for GPU compute resources and NVIDIA infrastructure. As an NVIDIA Inception member, CAMB.AI uses the Triton Inference Server to scale inference for its two models, MARS and BOLI, serving millions of requests worldwide.

Internally, CAMB.AI uses MARS5 alongside its own proprietary translation model, BOLI, for its core dubbing and translation functions. With the two models, CAMB.AI can transform English performances into low-resource languages such as Icelandic, Swahili, and even some Indigenous dialects.

Dedicated to an open-source philosophy that encourages collaboration and innovation, Prakash believes that “CAMB.AI stands as a testament to genuine AI research happening beyond traditional hubs. Innovation need not know any one language, color, or geography. The future belongs to everyone.”

Source: Businesswire
