
Google Cloud launches two new AI chips to compete with Nvidia
```json { "title": "Google Launches TPU 8t and 8i AI Chips to Challenge Nvidia", "metaDescription": "Google splits its eighth-generation TPU into two purpose-built AI chips — the TPU 8t for training and TPU 8i for inference — announced at Cloud Next 2026.", "content": "<h2>Google Unveils TPU 8t and TPU 8i at Cloud Next 2026, Splitting AI Chip Architecture for the First Time</h2><p>Google made one of its most significant hardware announcements in years on April 22, 2026, at Google Cloud Next in Las Vegas: the company is splitting its eighth-generation Tensor Processing Unit (TPU) into two purpose-built chips — the <strong>TPU 8t</strong>, designed for AI model training, and the <strong>TPU 8i</strong>, designed for inference. The announcement, made at Mandalay Bay and reported across CNBC, SiliconANGLE, and 9to5Google, marks a clear architectural departure from Google's previous TPU generations and signals an escalating push to compete with Nvidia in the fast-growing AI infrastructure market.</p><p>The new chips are not sold as standalone hardware. They are available only through Google Cloud, so organizations must use Google's cloud platform to take advantage of them.</p><h2>What the New Chips Can Do: Performance Numbers That Matter</h2><p>The performance claims Google is making for both chips are substantial, though the company notably stopped short of directly comparing them to Nvidia's offerings.</p><p>The <strong>TPU 8t</strong> training chip delivers <strong>2.8 times the performance of the seventh-generation Ironwood TPU for the same price</strong>, according to CNBC. A single TPU 8t superpod scales to 9,600 chips and 2 petabytes of shared high-bandwidth memory, delivering 121 exaflops of compute. Google claims the TPU 8t can reduce the frontier model development cycle from months to weeks. The chip also provides 10x faster storage access than its predecessor, and its Virgo Network — combined with JAX and Pathways software — enables near-linear scaling for up to one million chips in a single logical cluster.</p>
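<p>Those pod-level figures are enough for a rough sanity check. The short Python sketch below derives approximate per-chip numbers from the quoted superpod totals; Google has not published per-chip specifications for the TPU 8t, so the derived values are illustrative only.</p><pre><code># Back-of-envelope math from the pod-level TPU 8t figures quoted above.
# The per-chip values are derived for illustration; Google has not
# published official per-chip specifications.

CHIPS_PER_SUPERPOD = 9_600
SHARED_HBM_PETABYTES = 2
SUPERPOD_EXAFLOPS = 121

hbm_per_chip_gb = SHARED_HBM_PETABYTES * 1_000_000 / CHIPS_PER_SUPERPOD
compute_per_chip_pflops = SUPERPOD_EXAFLOPS * 1_000 / CHIPS_PER_SUPERPOD

print(f'HBM per chip: roughly {hbm_per_chip_gb:.0f} GB')                  # ~208 GB
print(f'Compute per chip: roughly {compute_per_chip_pflops:.1f} PFLOPS')  # ~12.6 PFLOPS
</code></pre><p>By this rough math, each chip would contribute around 208 GB of high-bandwidth memory and about 12.6 petaflops to the pod totals, which is useful as a sense of scale rather than a spec sheet.</p>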
<p>The <strong>TPU 8i</strong> inference chip delivers <strong>80% better performance than Ironwood</strong> and features 384 megabytes of on-chip SRAM — triple the amount found in the Ironwood generation. It pairs that SRAM with 288 GB of high-bandwidth memory and doubled inter-chip interconnect (ICI) bandwidth of 19.2 Tb/s, a combination optimized for Mixture of Experts (MoE) models. A single TPU 8i pod can hold 1,152 chips, up significantly from 256 chips in an Ironwood pod.</p><p>Both chips deliver <strong>twice the performance per watt</strong> of the previous generation, according to SiliconANGLE.</p><h2>Why Google Is Splitting Training and Inference Into Separate Chips</h2><p>The decision to bifurcate the eighth-generation TPU reflects a broader industry recognition that training and inference are fundamentally different computational workloads — and that a single chip optimized for both may serve neither particularly well.</p><p>Training large AI models demands maximum raw compute power and the ability to scale across thousands of chips simultaneously. Inference — running an already-trained model to respond to user queries in real time — requires low latency, large fast-access memory, and efficient chip-to-chip communication. As AI agents and chatbots generate exponentially more inference demand, the hardware requirements for inference have grown distinct enough to justify dedicated silicon.</p>
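<p>A toy sketch in JAX, the framework Google pairs with its TPUs, makes the structural difference concrete: a training step runs a forward pass, a backward pass to compute gradients, and a weight update, while an inference step is a forward pass alone. The tiny linear model below is purely illustrative and has nothing to do with Google's production stack.</p><pre><code>import jax
import jax.numpy as jnp

# Toy linear model, just enough to show how the two workloads differ.
def predict(params, x):
    return x @ params['w'] + params['b']

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

# Training step: forward pass, backward pass (gradients), weight update.
# This compute-heavy, scale-out path is what a training chip targets.
@jax.jit
def train_step(params, x, y, lr=0.01):
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Inference step: forward pass only. No gradients, no updates; latency
# and memory access dominate, the profile an inference chip targets.
infer_step = jax.jit(predict)

key = jax.random.PRNGKey(0)
params = {'w': jax.random.normal(key, (8, 1)), 'b': jnp.zeros((1,))}
x = jax.random.normal(key, (32, 8))
y = jnp.ones((32, 1))

params = train_step(params, x, y)  # training: gradients plus update
preds = infer_step(params, x)      # inference: forward only
</code></pre><p>In that sketch, train_step is the kind of work the TPU 8t is built for, while the forward-only infer_step mirrors the latency-sensitive serving path the TPU 8i targets.</p>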
<p>Amin Vahdat, Google's senior vice president and chief technologist for AI and infrastructure, explained the rationale directly: <em>"With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving."</em></p><p>Jeff Dean, Google's chief scientist, echoed the logic: <em>"As demand grows for quickly processing AI queries, it now becomes sensible to specialize chips more for training or more for inference workloads."</em></p><p>The inference market has become particularly contested. Gartner analyst Chirag Dekate put it plainly: <em>"The battleground is shifting towards inference."</em></p><h2>Major Customers and Strategic Partnerships Backing Google's Chip Ecosystem</h2><p>Google's TPU ambitions are buttressed by a growing roster of major enterprise and AI customers — and a long-term manufacturing partnership that extends well into the decade.</p><p>According to Bloomberg reporting via Yahoo Finance and Business Standard, <strong>Meta Platforms signed a multibillion-dollar deal</strong> to use TPUs through Google Cloud over several years. Meta's head of infrastructure, Santosh Janardhan, acknowledged the potential advantages of the new chips while maintaining measured expectations: <em>"It does look like there might be inference advantages,"</em> he said, while also cautioning that <em>"no new platform is without hurdles and a learning curve."</em></p><p>Anthropic, the AI safety company, has separately expanded its TPU access to up to <strong>one million TPU chips</strong>, with well over a gigawatt of capacity coming online in 2026, according to a Google Cloud press release from October 23, 2025. Thomas Kurian, CEO of Google Cloud, noted: <em>"Anthropic's choice to significantly expand its usage of TPUs reflects the strong price-performance and efficiency its teams have seen with TPUs for several years."</em> Anthropic has also signed a separate deal with Broadcom — Google's TPU manufacturing partner — for chips enabling roughly 3.5 gigawatts of computing power starting in 2027.</p><p>On the enterprise side, CNBC reported that Citadel Securities built quantitative research software using Google's TPUs, and all 17 U.S. Energy Department national laboratories use AI co-scientist software built on the chips.</p><p>Underpinning all of this is Google's renewed long-term agreement with <strong>Broadcom</strong> to develop and supply custom AI chips, including future TPU generations, through 2031.</p><h2>Context: Google's Position in the AI Chip Market</h2><p>Google has been developing TPUs for over a decade, positioning them as an alternative to Nvidia's GPUs, which remain the dominant hardware choice for training frontier AI models. Despite that dominance, Google has not abandoned Nvidia — the company continues to offer Nvidia GPUs alongside TPUs to its cloud customers, and CNBC noted that Google remains a large Nvidia customer. Notably, Google did not directly compare the new chips' performance to Nvidia's in its announcements.</p><p>The scale of Google's chip business is significant regardless. DA Davidson analysts, as cited by CNBC, estimated the TPU business combined with the Google DeepMind AI group to be worth approximately <strong>$900 billion</strong>.</p><p>On the demand side, Google's own first-party models now process more than <strong>16 billion tokens per minute</strong> via direct API use, up from 10 billion the previous quarter, according to Sundar Pichai's official Cloud Next blog post. That trajectory — and the infrastructure required to support it — helps explain why Google is investing heavily in purpose-built inference silicon.</p>
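<p>Those throughput figures compound quickly. The quick sketch below uses only the two quoted data points to show the implied daily volume and quarter-over-quarter growth; both derived numbers are illustrative, not figures Google published.</p><pre><code># Implied scale of the quoted token-throughput figures.
# The daily total and growth rate are derived, illustrative numbers.

TOKENS_PER_MINUTE_NOW = 16e9   # quoted for the current quarter
TOKENS_PER_MINUTE_PREV = 10e9  # quoted for the previous quarter

tokens_per_day = TOKENS_PER_MINUTE_NOW * 60 * 24
qoq_growth_pct = (TOKENS_PER_MINUTE_NOW / TOKENS_PER_MINUTE_PREV - 1) * 100

print(f'Tokens per day: about {tokens_per_day / 1e12:.0f} trillion')  # ~23 trillion
print(f'Quarter-over-quarter growth: {qoq_growth_pct:.0f}%')          # 60%
</code></pre><p>At the quoted rate, Google's first-party models would be processing on the order of 23 trillion tokens per day, a 60% jump in a single quarter.</p>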
<h2>Expert Reactions: Cautious Optimism and Acknowledged Complexity</h2><p>Expert commentary around Google's new chips reflects both genuine enthusiasm and measured realism. Meta's Santosh Janardhan flagged potential inference advantages while acknowledging the transition challenges any new platform brings. Gartner's Chirag Dekate framed the broader competitive shift succinctly: the battleground is moving to inference. And within Google, Paul Barham, a Google distinguished scientist who co-leads the Gemini infrastructure team, referenced the complexity of building reliable large-scale systems — noting that distributed AI infrastructure, if not engineered carefully, can <em>"completely self-destruct."</em></p><p>These reactions collectively suggest that while the TPU 8t and 8i represent meaningful technical progress, adoption at scale — particularly for customers migrating from Nvidia's ecosystem — will involve real engineering work.</p><h2>What Comes Next for Google's TPU Strategy</h2><p>Google has not announced pricing details for the TPU 8t or TPU 8i beyond the performance-per-dollar comparison to Ironwood. Availability timelines for general Google Cloud customers beyond the announced major partnerships had not been specified at the time of publication.</p><p>What is clear is the strategic direction: Google is building a specialized, vertically integrated AI hardware stack that spans custom silicon (in partnership with Broadcom through 2031), software frameworks (JAX, Pathways), and cloud delivery — all optimized for the workloads that are driving AI costs and capabilities in 2026. The TPU 8i's architecture, in particular, appears designed to capture a growing share of the inference market as AI agent usage scales globally.</p><p>Whether Google can close the gap with Nvidia — especially for training workloads where Nvidia's ecosystem lock-in remains strong — will depend not just on chip specifications but on software maturity, developer tooling, and the real-world experiences of early adopters like Meta and Anthropic.</p><p>For more tech news, visit our <a href=\"/news\">news section</a>.</p><h2>The Productivity Angle: Why AI Chip Progress Matters Beyond the Data Center</h2><p>For health and productivity-focused professionals, the implications of faster, more efficient AI inference chips are tangible. Every AI tool that helps you manage your schedule, synthesize research, or support your wellbeing runs on inference infrastructure. As chips like the TPU 8i reduce latency and cost, AI-powered productivity applications become faster, cheaper, and more capable — directly affecting the quality of tools available to optimize your daily performance. Staying informed about the infrastructure powering these applications gives you an early view of where those tools are headed. Join the <a href=\"/#waitlist\">Moccet waitlist</a> to stay ahead of the curve.</p>", "excerpt": "Google announced at Cloud Next 2026 in Las Vegas that it is splitting its eighth-generation TPU into two purpose-built chips: the TPU 8t for AI model training and the TPU 8i for inference. The TPU 8t delivers 2.8 times the performance of the previous Ironwood generation for the same price, while the TPU 8i targets the fast-growing inference market with 3x more on-chip SRAM. Major customers including Meta Platforms and Anthropic have already signed significant deals to use Google's TPUs through Google Cloud.", "keywords": ["Google TPU 8t", "Google TPU 8i", "AI chips 2026", "Google Cloud Next 2026", "inference chip Nvidia competitor"], "slug": "google-tpu-8t-8i-ai-chips-cloud-next-2026" } ```