
AMD's Halo Box Puts 128GB of RAM in a Mini PC Built for Local AI Models

April 30, 2026

128GB of unified memory in a mini PC. AMD's Halo Box, built around the Ryzen AI Max+ 395 processor, is aimed squarely at users who want to run large AI language models on their own hardware without buying server equipment.

The unified memory spec is the key detail here. In a traditional desktop setup, the GPU has its own separate memory pool - usually 8GB to 24GB on consumer graphics cards. A model's weights generally need to fit in that pool to run at full speed, which is why most people can only run small or heavily compressed models locally. Unified memory means the CPU and GPU share the same pool. At 128GB, you can load a 70B parameter model (like Meta's Llama 3.1 70B) at 8-bit precision, which takes roughly 70GB, with room to spare; the unquantized 16-bit weights come to about 140GB and would not fit. Quantized versions - compressed formats that trade some quality for much smaller file sizes - stretch further, but 405B-scale models remain a reach: even at 4 bits per weight they need about 200GB, so only very aggressive compression would bring one under 128GB.
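As a rough check on how much of a 128GB pool a model's weights would occupy, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not a measurement: the helper names `weight_footprint_gb` and `fits_in_memory` are hypothetical, sizes are in decimal gigabytes, and the calculation counts weights only, ignoring the KV cache, activations, and whatever memory the operating system reserves for itself.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in decimal GB.

    Assumption: every parameter is stored at the same bit width, with no
    per-block scaling overhead (real quantized formats add a little extra).
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_memory(params_billion: float, bits_per_weight: float,
                   memory_gb: float = 128.0) -> bool:
    """True if the weights alone are smaller than the memory pool.

    This is a necessary condition, not a sufficient one: the KV cache and
    the operating system also need space.
    """
    return weight_footprint_gb(params_billion, bits_per_weight) < memory_gb
```

Under these assumptions, a 70B model works out to about 140GB at 16 bits, 70GB at 8 bits, and 35GB at 4 bits, while a 405B model needs about 202.5GB at 4 bits. Real headroom is smaller than these figures suggest, since long context windows can add many gigabytes of cache on top of the weights.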

The Competition Is Apple

The obvious comparison is Apple's M-series chips, which have used the same unified memory architecture for years. An M4 Max MacBook Pro with 128GB currently runs $4,000-$5,000. AMD hasn't announced Halo Box pricing yet, but if it comes in meaningfully cheaper and real-world performance holds up, it gives Windows and Linux users a genuine local AI machine without paying Apple's premium.

A demo unit has been spotted running Ubuntu, which makes sense - most open-source model serving tools (like AnythingLLM) are built Linux-first. The Ryzen AI Max+ 395 also includes a neural processing unit (NPU) rated at 50 TOPS (tera operations per second), built directly into the chip.

The real question is inference speed on actual workloads - how fast the model generates text in practice, not on a spec sheet. Apple's Neural Engine has years of software optimization behind it. AMD's local AI compute story on consumer silicon is newer territory, and production benchmarks will matter far more than announcement photos.
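For readers who want to measure generation speed themselves once hardware is available, a minimal timing sketch follows. It assumes only that the model runner exposes some function that yields tokens one at a time; `tokens_per_second` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in that sleeps instead of running a model, so the number it produces says nothing about any real device.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(generate: Callable[[str], Iterable[str]], prompt: str) -> float:
    """Time a streaming generator and return generated tokens per second.

    The measured interval includes prompt processing, so short runs will
    understate steady-state generation speed.
    """
    start = time.perf_counter()
    count = sum(1 for _ in generate(prompt))  # consume the stream, counting tokens
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else 0.0


def fake_generate(prompt: str) -> Iterable[str]:
    """Hypothetical stand-in for a real model: yields four tokens slowly."""
    for token in ("local", "inference", "speed", "test"):
        time.sleep(0.01)  # simulated per-token latency, not a real measurement
        yield token
```

Calling `tokens_per_second(fake_generate, "hello")` exercises the helper end to end; swapping in a real streaming function from whichever runtime is in use would give a figure that can be compared across machines, provided the model, quantization level, and prompt are held constant.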
