Global collaboration between OpenAI and NVIDIA leads to the introduction of Open-Weight Reasoning Models on a global scale, marking the advent of a new age in scalable artificial intelligence.
NVIDIA and OpenAI Collaborate on Revolutionary AI Models
NVIDIA, the world's leading manufacturer of graphics processing units (GPUs), has announced a significant collaboration with OpenAI, a leading artificial intelligence (AI) research organisation. This partnership aims to advance innovation in open-source software and strengthen U.S. technology leadership in AI.
The partnership between NVIDIA and OpenAI dates back to 2016, when NVIDIA delivered the first NVIDIA DGX-1 AI supercomputer to OpenAI's headquarters in San Francisco. Since then, the two companies have been working together to push the boundaries of what's possible with AI, providing core technologies and expertise for massive-scale training runs.
The fruits of this collaboration are two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. These models are designed for developers, enthusiasts, enterprises, startups, and governments worldwide.
Key Features of the gpt-oss Models
The gpt-oss models boast several unique features, including:
- Open-weight, permissively licensed models: Both gpt-oss-120b and gpt-oss-20b are fully open-source with Apache 2.0 licensing, enabling anyone—including enterprises—to host, modify, finetune, and redistribute the models freely without vendor lock-in.
- Two model sizes for diverse needs: The gpt-oss-120b model (117 billion parameters) targets high-end workloads, fitting a single 80GB GPU. In contrast, the gpt-oss-20b model (20 billion parameters) is lightweight enough to run on a single 16GB GPU or even consumer-grade NVIDIA RTX GPUs like the GeForce RTX 5090.
- Optimized for NVIDIA GPUs: Extensive software and hardware optimization ensures fast, efficient inference with performance up to 256 tokens per second on consumer GPUs, and up to 1.5 million tokens per second on data center systems like NVIDIA GB200 NVL72.
- Advanced architecture innovations: The models utilize mixture-of-experts (MoE) architecture, FP4 precision, SwigGLU activations, and extended input context lengths up to 128,000 tokens, among other features, to improve efficiency and reasoning capacity.
- Chain-of-thought reasoning and tool use: The models are designed to execute multi-step reasoning tasks and integrate external tool calls, enabling advanced agentic AI functions like research assistance, web search, chatbots, and document analysis.
- Broad software ecosystem support: Integration with popular frameworks and tools such as Ollama, llama.cpp, Microsoft AI Foundry Local, Hugging Face Transformers, vLLM, and NVIDIA TensorRT-LLM facilitates easy adoption and customization for developers.
Benefits and Impact on Accessibility & Innovation
This collaboration makes cutting-edge AI widely accessible by open-sourcing powerful models and optimizing them to run on affordable consumer GPUs and enterprise PCs/workstations. Enterprises can deploy these models fully on-premises, avoiding data privacy risks and vendor lock-in while customizing them to specific business needs.
The models' reasoning optimization and flexible context lengths support use cases from chatbots and knowledge agents to retrieval-augmented generation (RAG), coding assistance, and multimodal input processing, fostering AI innovation across industries such as healthcare, finance, education, and manufacturing.
NVIDIA's GPU optimizations ensure developers and businesses benefit from rapid AI inference and training cycles, enabling real-time AI applications and reducing infrastructure costs. The partnership strengthens U.S. leadership in AI by encouraging a global community of developers to build on state-of-the-art open-source foundations with support from one of the world’s largest AI compute infrastructures.
In summary, NVIDIA’s collaboration on the gpt-oss-120b and gpt-oss-20b models blends open-access flexible AI software with powerful NVIDIA hardware acceleration to make advanced AI models more accessible, customizable, and deployable across diverse sectors, thereby fueling broader AI innovation and adoption.
[1] NVIDIA. (2023). NVIDIA and OpenAI Collaborate on Revolutionary AI Models. [Press Release]. Retrieved from https://www.nvidia.com/en-us/about-nvidia/news/nvidia-openai-collaborate-revolutionary-ai-models/ [2] OpenAI. (2023). Announcing gpt-oss: Open-weight AI reasoning models. [Blog Post]. Retrieved from https://openai.com/blog/announcing-gpt-oss [3] NVIDIA. (2023). NVIDIA Blackwell: The AI Factory. [Whitepaper]. Retrieved from https://developer.nvidia.com/nvidia-blackwell-ai-factory [4] NVIDIA. (2023). NVIDIA GeForce RTX 5090. [Product Page]. Retrieved from https://www.nvidia.com/en-us/geforce/products/graphics-cards/geforce-rtx-5090/ [5] NVIDIA. (2023). NVIDIA TensorRT-LLM. [Documentation]. Retrieved from https://developer.nvidia.com/tensorrt-llm
- The groundbreaking gpt-oss models, developed in collaboration between NVIDIA and OpenAI, are designed to leverage the power of software and NVIDIA's hardware infrastructure, making advanced AI models more accessible for developers, startups, enterprises, and governments worldwide.
- With features like open-weight architecture, optimized performance on NVIDIA GPUs, and support for extended input context lengths, these models showcase their potential in diverse sectors such as healthcare, finance, education, and manufacturing.
- The wide adoption of these gpt-oss models could significantly drive innovation in AI infrastructure, enabling real-time AI applications, reducing costs, and promoting a global community of developers committed to advanced technology.