Meta Unveils Four Custom AI Chips, Plans Full Deployment by End of 2027

Four chip generations in two years. That's the pace Meta just committed to with its custom MTIA silicon program, and it signals a real shift in how Big Tech thinks about AI infrastructure costs.

Meta announced four new chips in its MTIA (Meta Training and Inference Accelerator) family, all manufactured by TSMC in partnership with Broadcom. The first, MTIA 300, is already running in production. The last, MTIA 500, is slated for mass deployment by mid-2027. That works out to a new generation roughly every six months.

What the Four Chips Actually Do

The MTIA 300 handles training for ranking and recommendation models, the systems that decide what appears in your Facebook and Instagram feeds. It's the only training chip of the bunch.

The remaining three, MTIA 400 (codenamed Iris), MTIA 450 (Arke), and MTIA 500 (Astrid), are all optimized for generative AI inference. That means they're built to run already-trained models quickly and cheaply, not to train new ones from scratch. Think: generating images from text prompts, powering Meta AI assistant responses, creating videos.

The performance jump across generations is steep. MTIA 300 delivers 6.1 TB/s of memory bandwidth. MTIA 500 hits 27.6 TB/s, roughly a 4.5x increase. Compute performance scales even more aggressively, from 1.2 petaflops on the 300 to 30 petaflops on the 500 in MX4 format (a 4-bit, low-precision number format increasingly used to run AI models efficiently).
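The scaling factors quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures reported in this article; the dictionary layout and the `scaling_factor` helper are illustrative names, not anything Meta publishes. It also assumes the two compute figures are directly comparable, which the article does not confirm (only the MTIA 500 number is explicitly tied to MX4).

```python
# Figures as quoted in the article; the structure here is illustrative only.
specs = {
    "MTIA 300": {"bandwidth_tb_s": 6.1, "compute_pflops": 1.2},
    "MTIA 500": {"bandwidth_tb_s": 27.6, "compute_pflops": 30.0},
}


def scaling_factor(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


bandwidth_gain = scaling_factor(
    specs["MTIA 300"]["bandwidth_tb_s"], specs["MTIA 500"]["bandwidth_tb_s"]
)
# Assumption: both compute figures are comparable, even though only the
# MTIA 500 number is explicitly stated to be in MX4 format.
compute_gain = scaling_factor(
    specs["MTIA 300"]["compute_pflops"], specs["MTIA 500"]["compute_pflops"]
)

print(f"Memory bandwidth gain: {bandwidth_gain:.1f}x")
print(f"Compute gain: {compute_gain:.0f}x")
```

Compute grows far faster than memory bandwidth in these figures, so the amount of compute available per byte of bandwidth rises from the 300 to the 500.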

Meta claims the MTIA 400 already delivers "raw performance competitive with leading commercial products," a not-so-subtle reference to Nvidia's GPUs. The MTIA 450's memory bandwidth, Meta says, exceeds what's currently available from any off-the-shelf chip.

This Isn't an Nvidia Breakup

The timing is notable. Less than a month before this announcement, Meta agreed to buy billions of dollars' worth of Nvidia processors. In February, it signed a $1 billion deal with AMD. The company has guided $115 billion to $135 billion in capital expenditure for 2026 alone.

So why build custom chips at all? Meta VP of Engineering Yee Jiun Song put it plainly: "We're not building for the general market, so our chips don't need to be as general purpose. We can cut out things we don't need, which really allows us to drive down cost."

Nvidia GPUs are designed as general-purpose accelerators, optimized first for training and then adapted for inference. Meta's MTIA chips flip that priority. They're inference-first, purpose-built for Meta's specific workloads. The company already has hundreds of thousands of MTIA chips deployed in production.

The chiplet-based architecture is the clever engineering detail here. Each new MTIA generation drops into the same physical chassis, rack, and network infrastructure as its predecessor. That means Meta can upgrade compute without ripping out and replacing its data center hardware, a massive cost and logistics advantage when you're operating at this scale.

The Bigger Picture for AI Costs

Meta, Google (with TPUs), Amazon (with Trainium and Inferentia), and Microsoft (with Maia) are all building custom silicon. The pattern is clear: running AI inference at scale on general-purpose GPUs is too expensive for companies serving billions of users.

For the rest of us who use these AI products daily, the downstream effect is straightforward. Cheaper inference means companies can offer more AI features without raising subscription prices, or they can improve model quality without proportionally increasing costs. The 40 million people who use Meta AI every day are the direct beneficiaries.

Meta hasn't published specific dollar savings figures, which is typical for custom chip programs. But the trajectory from MTIA 300 to MTIA 500, a 25x compute improvement across four generations in about two years, suggests that cost per inference is set to drop significantly for every AI feature Meta ships.
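To put the 25x figure in perspective, here is a rough sketch of the average gain it implies per generation step. It assumes three transitions between the four chips (300 to 400, 400 to 450, 450 to 500) and takes a geometric mean. The article gives no compute figures for the MTIA 400 or 450, so the result is an implied average rather than a per-chip spec, and raw compute gains do not translate one-to-one into cost savings.

```python
# Figures from the article: a 25x compute gain across four generations.
total_gain = 25.0
generations = 4
steps = generations - 1  # three transitions: 300 -> 400 -> 450 -> 500

# Geometric mean: the constant per-step multiplier that compounds to total_gain.
per_step_gain = total_gain ** (1 / steps)

print(f"Implied average compute gain per generation step: {per_step_gain:.2f}x")
```

Under these assumptions, each new generation would need to deliver a bit under 3x the compute of its predecessor, on average, to reach 25x overall.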