



Authors Paul Tremblay and Mona Awad have filed a lawsuit against artificial intelligence firm OpenAI in U.S. District Court in California.
The authors claim the company violated copyright law by using their books to train ChatGPT, its language model, without their consent.
As described in the complaint, ChatGPT is a large language model that builds its capabilities by ingesting immense volumes of text, referred to as a “training dataset,” and extracting information from it. That process, however, has raised serious legal questions.
According to the lawsuit, neither Tremblay nor Awad, both Massachusetts-based authors, gave OpenAI permission to use their copyrighted works to train ChatGPT. Even so, their books were allegedly included in the training data.
Tremblay, the creator of books like “The Cabin at the End of the World,” and Awad, author of works including “13 Ways of Looking at a Fat Girl” and “Bunny,” both hold registered copyrights for their respective works.
The authors argue that ChatGPT could produce summaries of their books only if it had been trained on them, and that OpenAI profits commercially from that use.
“Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works — something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works,” the complaint says. “Defendants, by and through the use of ChatGPT, benefit commercially and profit richly from the use of Plaintiffs’ and Class members’ copyrighted materials.”
The lawsuit cites a June 2018 paper in which OpenAI disclosed that it had trained its GPT-1 model on BookCorpus, a collection of “over 7,000 unique unpublished books from a variety of genres, including Adventure, Fantasy, and Romance.”
The paper noted that a dataset of books is especially valuable because it contains long stretches of contiguous text, which help the AI model learn to “condition on long-range information.”
The document goes on to state that “Hundreds of large language models have been trained on BookCorpus, including those made by OpenAI, Google, Amazon, and others,” thereby raising critical questions about the potential infringements of copyright laws in the AI training process.
According to Andres Guadamuz, an intellectual property law expert at the University of Sussex, this lawsuit is the first of its kind against OpenAI concerning copyright law.
Joseph Saveri and Matthew Butterick, the attorneys representing Tremblay and Awad, assert that books make ideal training material for large language models because of their “high-quality, well-edited, long-form prose,” calling them “the gold standard of idea storage for our species.”
“Defendants breached their duties by negligently, carelessly, and recklessly collecting, maintaining and controlling Plaintiffs’ and Class members’ Infringed Works and engineering, designing, maintaining and controlling systems — including ChatGPT — which are trained on Plaintiffs’ and Class members’ Infringed Works without their authorization,” the complaint says.
The lawsuit seeks statutory and additional damages. Fox News Digital reached out to OpenAI for comment but had not received a response at the time of reporting.
Given the immense amount of material used to train AI models, more copyright infringement cases against AI companies are likely to follow.
