OpenAI sued for copyright infringement

Post time：07-05 2023 Source：europa.eu

tags： copyright infringement ChatGPT OpenAI

On Wednesday 28 June, US authors Paul Tremblay and Mona Awad (the plaintiffs) filed a class action complaint against OpenAI in the San Francisco federal court, alleging that it infringed copyright when training its generative artificial intelligence system known as ChatGPT. The proposed class action, brought on behalf of the plaintiffs and all others similarly situated, alleges copyright infringement, violations of the Digital Millennium Copyright Act, unjust enrichment and negligence, among other claims.

As we already know, ChatGPT is a generative chatbot that extracts data from different sources and then processes it using Natural Language Processing (NLP). Since the launch of ChatGPT, there has been a lot of discussion about its relationship with intellectual property, and specifically with copyright: when is the output merely inspired by existing works, and when does it actually infringe them? We discussed it in this post.

Generative AI companies are facing a barrage of legal actions. Earlier this year, Getty Images sued Stability AI for training on millions of its pictures without consent. The proposed class action filed in the San Francisco federal court last Wednesday is based on the claim that OpenAI infringed copyright at two points: first, when it illegally downloaded copies of novels to train its artificial intelligence system, and second, when ChatGPT produces responses (output) that themselves infringe the rights in those works.

As to the first issue.

The plaintiffs alleged that much of the material in OpenAI's training datasets comes from copyrighted works, including books, which OpenAI copied without consent, without credit, and without compensation. Books have always been a key ingredient in training datasets for large language models, as they provide the best examples of high-quality long-form writing. In the July 2020 GPT-3 paper, OpenAI revealed that 15% of GPT-3's huge training dataset came from "two internet-based book corpora", which can be estimated at around 300,000 titles. The plaintiffs claimed that the only internet-based book corpora that have ever offered so much material are the notorious "shadow library" websites, which are blatantly illegal.

As evidence of infringement, the plaintiffs argued that when ChatGPT was asked to summarise the books written by each of them, it generated very accurate summaries, and that it could do so only because the books had been copied by OpenAI and ingested by the language model as part of its training data. The two authors alleged that OpenAI copied their books without permission while training its language models. They therefore sought damages and restitution of profits.

As to the second issue.

The plaintiffs argued that because the output of the OpenAI language models is based on expressive information extracted from the plaintiffs' works, each output is an infringing derivative work, made without the authors' permission and in violation of their exclusive rights. They alleged that OpenAI has benefited economically from this infringing output, and that each result generated by the chatbot constitutes an act of contributory copyright infringement. On this ground too, they sought damages and restitution of profits.

This class action concerns around 300,000 books that may have been plagiarised, and seeks to represent the hundreds of thousands of US authors whose copyrights may have been infringed, in many cases through websites that offer this content illegally.

Also on Wednesday, another class action was filed against OpenAI in a California federal court by Clarkson, a public-interest law firm, on behalf of anonymous clients. They accuse OpenAI of stealing and misappropriating vast swathes of personal data from the Internet.
