Thursday, June 25, 2026

OpenAI and Broadcom Introduce Custom “Jalapeño” Inference Chip Optimized for Large Language Models


OpenAI and Broadcom have officially introduced Jalapeño, OpenAI’s inaugural Intelligence Processor. Architected as a highly tailored AI accelerator, the custom hardware is built exclusively around modern Large Language Model (LLM) inference. The development serves as the opening chapter of a multi-generational computing platform co-engineered by both companies to optimize the velocity, reliability, and global scaling of advanced artificial intelligence workloads.

Broadcom president and CEO Hock Tan and president Charlie Kawwas personally handed the custom chip to OpenAI CEO Sam Altman and president Greg Brockman. The handover marks a milestone for OpenAI, which will now develop the hardware that powers its future models and products.

Rather than adapting an existing hardware architecture, OpenAI developed the silicon from scratch. This blank-slate approach draws on the organization’s expertise in LLM behavior, kernel optimization, and production-scale serving requirements. Broadcom and Celestica joined the initiative to industrialize the design, contributing expertise in silicon implementation, circuit board architecture, high-efficiency network integration, and high-volume rack-level manufacturing. Though tailored to OpenAI’s own architectural insights, Jalapeño retains the flexibility to run a broad range of current and future industry LLMs. Engineering samples in the lab are already running machine learning workloads, including GPT-5.3-Codex-Spark, at target production frequencies and power levels.


While final standardized benchmarks are still being gathered, preliminary validation indicates that Jalapeño will deliver a substantial gain in performance per watt over current industry baselines. A full technical performance disclosure is slated for publication in the coming months. The architecture limits internal data movement and balances compute, memory, and networking resources to push hardware utilization close to its theoretical peak. Broadcom’s silicon manufacturing techniques and Tomahawk networking components are being used to move the platform into high-volume commercial production.

“The world is moving to a compute-powered economy,” said Greg Brockman, President and Co-Founder of OpenAI. “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”

“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers,” said Richard Ho, who leads OpenAI’s hardware program. “We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI,” said Hock Tan, President and CEO, Broadcom. “This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt-scale data centers with Microsoft and other partners beginning in 2026.”

The Full-Stack Infrastructure Advantage

Jalapeño abandons general-purpose processing designs to target the specific bottlenecks of deep-learning applications like ChatGPT, Codex, and upcoming autonomous agent ecosystems. The custom platform combines the compute capacity and throughput typical of mainstream AI accelerators with the very low latency found in specialized niche hardware, creating a foundation for consumer-facing interactive products at massive scale.

By taking control of the hardware layer, OpenAI now operates across the entire computing stack, spanning foundational research models, software kernels, data scheduling, and direct user applications. This lets engineering teams align the separate layers toward a single objective: delivering highly optimized intelligence at lower operational cost.

This full-stack ownership kickstarts a powerful development flywheel. Enhanced silicon layout translates directly into computing efficiencies. Lower overhead allows for the training and execution of more complex systems, yielding superior consumer tools. Increased performance drives user engagement and corporate adoption, which yields the capital necessary to fund subsequent hardware generations. Over time, this cyclical loop aims to lower the global cost of digital intelligence.

[Infographic detailing OpenAI’s full-stack flywheel: Chip Architecture to Software Kernels, Frontier Models, Consumer Products, and Reinvestment]

Unprecedented Nine-Month Tape-Out Window

The chip went from initial concept to silicon tape-out in just nine months. This aggressive schedule is believed to be the shortest application-specific integrated circuit (ASIC) design cycle on record in the advanced semiconductor sector.

This rapid turnaround was driven by integrated hardware-software collaboration, Broadcom’s manufacturing expertise, and the use of OpenAI’s own neural networks to automate and optimize specific chip design parameters. The resulting feedback loop reflects a shift in which current production models help refine the silicon required to train and run the next wave of artificial intelligence.

Multi-Generational Roadmap for Global Scaling

Jalapeño serves as the first foundational piece of a larger, multi-generational computing roadmap scheduled for initial physical deployment by late 2026. The infrastructure blueprint fuses OpenAI’s specialized accelerator layouts with Broadcom’s routing and interface silicon, backed by Celestica’s physical rack-integration engineering.

Ultimately, this initiative recognizes that inference is the primary channel through which AI reaches people. Every gain in compute efficiency shows up directly as faster interface responses, longer autonomous reasoning with minimal delay, and lower prices across developer APIs. Through this hardware rollout, the organization intends to make advanced AI models more stable and more widely accessible to students, small businesses, and researchers globally.
