Nvidia Shifts to Revenue-Sharing Model for AI Factory Deployments
Synopsis
Chip giant Nvidia Corporation announced on Thursday, 2 July 2026, that it is partnering with AI cloud providers to deploy large-scale, multi-tenant AI factories through a revenue-sharing and credit-support model, marking a structural shift in how the company monetises its compute infrastructure.
Context
In its post, Nvidia stated that 'AI is shifting from model training to always-on token production, and that shift demands a new business model.' The company said it would work with AI clouds to open up compute access to 'startups, model builders, enterprises, research organisations and regional AI players', a customer base that spans much of the AI value chain.
The announcement reflects a fundamental change in how AI workloads are structured. Where earlier cycles were dominated by episodic, large-scale model training runs, the industry is now moving toward continuous inference workloads — systems that generate tokens in production around the clock, requiring persistent, always-available compute capacity.
Policy Backdrop
Nvidia has been building toward this model for several years. In 2023, the company launched DGX Cloud — its full-stack AI supercomputing service — in partnership with Microsoft, Google, and Oracle, delivering high-performance training infrastructure through major cloud intermediaries rather than direct hardware sales.
In 2024, Nvidia unveiled its Blackwell GPU architecture, engineered for efficiency across both training and inference workloads. The revenue-sharing and credit-support model announced in July 2026 extends that trajectory: instead of requiring upfront capital expenditure from customers, Nvidia and its cloud partners absorb the infrastructure cost and recover it through usage-linked revenue arrangements.
This mirrors a pattern seen in earlier high-performance computing transitions, where specialised infrastructure — once accessible only to well-capitalised institutions — was progressively democratised through service models.
Stakeholders and Impact
AI startups and regional AI players stand to benefit most directly. Access to large-scale GPU clusters has historically required either significant capital or existing relationships with hyperscale cloud providers. A revenue-sharing structure lowers that barrier, allowing smaller organisations to deploy production inference workloads without front-loading infrastructure costs.
Enterprises and research organisations gain flexibility to scale token production capacity in line with demand, rather than committing to fixed hardware procurement cycles. For model builders — companies and teams developing foundation or fine-tuned models — the multi-tenant factory model offers a path to production deployment without owning dedicated clusters.
For Nvidia itself, the shift diversifies revenue beyond one-time hardware sales into recurring, usage-linked income streams tied to the growth of AI inference at scale.
What's Next
Specific partner names and revenue-sharing terms for the July 2026 arrangements have not yet been publicly disclosed. Detailed disclosures are expected in Nvidia's upcoming quarterly filings, which should clarify the financial structure and the identity of participating cloud partners.
Observers will also watch whether Nvidia expands the programme's scope at future editions of GTC, its principal developer conference, where major infrastructure and partnership announcements have historically been made. As the AI sector's centre of gravity shifts from training to inference, Nvidia's ability to capture recurring revenue from always-on token production could significantly reshape its long-term business profile.