The Allen Institute for AI (Ai2) has launched Olmo 3, a suite of fully open large language models that deliver state-of-the-art performance while exposing the entire “model flow” to users, from data collection through training checkpoints, in the name of transparency, trust, and extensibility. At its heart is Olmo 3-Think (32B), the first open 32B-scale reasoning model whose intermediate reasoning traces can be inspected and traced back to its training data. The Olmo 3 family also includes Olmo 3-Base models (7B and 32B), which outperform other fully open base models on benchmarks in math, coding, and reading comprehension and support long contexts of up to ~65K tokens, as well as Olmo 3-Instruct (7B), fine-tuned for chat, instruction following, and tool use.
Additionally, Olmo 3-RL Zero (7B) supports reinforcement-learning experimentation through publicly released RL-trained checkpoints. In contrast to opaque, closed-weight models, Olmo 3 offers full traceability: all training data, code, model weights, and checkpoints are released under permissive open-source licenses, and users can intervene at any stage, whether mid-training, post-training, or even pre-training, to adapt the models for domain-specific use. Ai2’s data pipeline draws on the new ~9.3-trillion-token Dolma 3 corpus plus a curated Dolci dataset for post-training, and its efficient training infrastructure (built on H100 GPUs) significantly reduces cost and compute. Through the integrated OlmoTrace tool, users can inspect in real time how model outputs relate to source training data, enabling fine-grained auditing, debugging, and refinement. Olmo 3’s release represents a major step toward truly open, inspectable AI, empowering researchers, developers, and institutions to build, understand, and extend large language models with unprecedented transparency and control.
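For readers who want to experiment, the sketch below shows one plausible way to run an Olmo 3 checkpoint with the Hugging Face transformers library. Note that the repository ID used here is an assumption modeled on Ai2’s naming for earlier Olmo releases, not a confirmed identifier; consult Ai2’s official model pages for the exact hub ID.

```python
# Minimal sketch: generating text with an Olmo 3 checkpoint via Hugging Face
# transformers. The model ID below is hypothetical, based on Ai2's naming for
# earlier Olmo releases; verify the actual identifier on Ai2's hub pages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B-Instruct"  # hypothetical hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Why does fully open training data matter for auditing a language model?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a short completion; sampling settings are illustrative defaults.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```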


