LLM Lawsuit

Sarah Silverman Headlines Lawsuit Accusing OpenAI and Meta of Being ‘Industrial-Strength Plagiarists’

Comedian Sarah Silverman and other writers are suing OpenAI and Meta, claiming they illegally used copyrighted material to train their generative AI models ChatGPT and LLaMA. Silverman and authors Christopher Golden and Richard Kadrey claim the companies violated intellectual property law to build their respective large language models.


The lawsuit against Meta and OpenAI centers on the information used to train their LLMs. The complaint alleges that books by the authors were among many copyrighted works in the training datasets. Meta and OpenAI did not credit or pay the authors and did not have permission from the copyright holders to use their work for that purpose. This would violate unfair competition laws and the Digital Millennium Copyright Act, according to the filing. Silverman, Golden, and Kadrey are represented in the lawsuit by Joseph Saveri and Matthew Butterick at the Joseph Saveri Law Firm. The lawyers previously filed a class action suit against OpenAI on behalf of authors Paul Tremblay and Mona Awad, and have set up a website that explains the cases, as they see them, to non-lawyers.

“We’ve filed lawsuits challenging ChatGPT and LLaMA, industrial-strength plagiarists that violate the rights of book authors,” Saveri and Butterick wrote in their introduction to the website. “Because AI needs to be fair & ethical for everyone.”

As evidence, the lawsuit points to how ChatGPT can summarize the books written by the authors, and to its supposed training on BookCorpus, a training dataset that included copyrighted material. The lawyers also found the plaintiffs’ works among the book pirating websites scraped for ‘The Pile,’ one of the datasets Meta has acknowledged it used to train LLaMA. The lawsuit asks for jury trials and wants the court to issue injunctions that could require OpenAI and Meta to make major changes to ChatGPT and LLaMA.

“Since the release of OpenAI’s ChatGPT system in March 2023, we’ve been hearing from writers, authors, and publishers who are concerned about its uncanny ability to generate text similar to that found in copyrighted textual materials, including thousands of books,” Saveri and Butterick wrote. “[W]e’ve filed a class-action lawsuit against OpenAI challenging ChatGPT and its underlying large language models, GPT-3.5 and GPT-4, which remix the copyrighted works of thousands of book authors—and many others—without consent, compensation, or credit.”


The debate over how intellectual property rules apply to synthetic media and generative AI has exploded this year. Lawsuits, and attempts to avoid them, now accompany nearly every new LLM or synthetic media tool. Getty Images has a suit against Stability AI over whether its text-to-image model, Stable Diffusion, breaks those rules. And the Saveri Law Firm popped up earlier this year representing a group of artists in a class action lawsuit against Stability AI, along with Midjourney and DeviantArt, with the same general complaint. In both cases, the issues arise from the copyrighted images among the billions of pictures used to train Stable Diffusion. That includes the open-source LAION-5B dataset and the images Stability scraped from the web, including from Getty’s servers, without their creators’ awareness.

Companies looking to skip the courtroom drama are circumscribing their training data and sometimes backing it up with their checkbook. Both Shutterstock and Adobe have independently said that if a client’s use of their generative AI tools leads to accusations of copyright violation, they will take up the legal costs. The point is to signal confidence that their respective synthetic media generators don’t violate any IP rules.
