A wave of custom AI inference chips from AWS (Trainium3), Google (TPU v6), and Microsoft (Maia 2) has driven inference costs down 60% year-over-year. NVIDIA responded by cutting H200 pricing and announcing aggressive B300 availability timelines. The cost to run a frontier AI model has fallen below $0.001 per 1K tokens on optimized infrastructure.
Get the complete analysis with detailed charts, methodology, and data appendices.
Week of March 24, 2026: AWS Announces Graviton5, Azure Cuts VM Pricing 12%
March 27, 2026
Week of March 17, 2026: Snowflake vs Databricks: The Data Platform Pricing War Escalates
March 20, 2026
Week of March 3, 2026: CNCF Releases Kubernetes Cost Management Standards v1.0
March 6, 2026
Get the IFO4 weekly briefing delivered to your inbox every Friday.
Subscribe Now