
Major Publishers Sue Meta Over Llama AI Training Data

May 5, 2026


Five major publishers - Macmillan, McGraw-Hill, Elsevier, Hachette, and Cengage - have filed a class action lawsuit against Meta, alleging the company used millions of copyrighted books to train its Llama AI models without permission or payment. One author has joined the suit. The plaintiffs describe the alleged conduct as "one of the most massive infringements of copyrighted materials in history."

Meta's Llama models are among the most widely used openly available AI systems in the industry. Training an AI model involves feeding it enormous amounts of text so it can learn language patterns and generate coherent responses. The publishers claim that their books were included in that training data without authorization.

This case stands out from other AI copyright suits because of who is filing. These aren't individual authors acting on principle. The five publishers collectively represent a major portion of global book publishing, and they have the legal resources to pursue this through a full trial.

Books are also harder to defend under fair use than scraped web text. Each book is a distinct, individually authored work with a named copyright holder, which makes it far easier to show which specific works were used and who owns them. The New York Times filed similar claims against OpenAI, maker of ChatGPT, though that case remains unresolved. If Meta's defense relies on fair use doctrine - the legal provision that allows limited use of copyrighted material under certain conditions - it will need to convince a court that building a commercial AI model on millions of copyrighted books qualifies. No court has definitively ruled on that question yet.

A ruling against Meta would reach well beyond this lawsuit. Every major AI lab would need to rethink where its training data comes from.
