
Recent court decisions re: the DMCA and AI web scraping and extraction

RICHARD WEINER
Technology for Lawyers

Published: January 17, 2025

Here is another installment of “old laws applied to new tech.”
When several news organizations sued OpenAI for violating copyright laws through its use of web scraping, some of the complaints also alleged violations of the Digital Millennium Copyright Act (DMCA), the late-1990s addition to copyright law that specifically extended protected status to certain vagaries of electronic media.
Because of this, the court decisions in these cases covered both regular copyright violations and DMCA violations.
The complaints focused on whether the AI training resulted in the removal of copyright management information (CMI) in violation of the DMCA. Section 1202(b)(1) of the DMCA, which was enacted in 1998, prohibits the intentional removal or alteration of CMI where the removing party knows, or has reasonable grounds to know, that it will induce, enable, facilitate or conceal copyright infringement.
The statute was enacted because certain identifiers were being removed for bandwidth considerations, but the law is still on the books, and now is getting applied to AI scraping.
Two recent cases that included allegations of CMI violations under the DMCA are Raw Story Media, Inc. v. OpenAI and The Intercept Media, Inc. v. OpenAI, Inc.
We noted some time back that these cases had been filed, and now the rulings are here.
Raw Story (and AlterNet Media) brought suit in SDNY federal court against OpenAI and associated organizations, alleging “thousands” of violations of Section 1202(b)(1).
Defendants moved to dismiss the case because of a lack of provable harm.
The judge sided with the defendants and dismissed the case, stating that there was no specific Article III harm to the plaintiffs and that Section 1202(b)(1) covers only the integrity of the CMI, not how it is used. (IDK, personally).
The Intercept had a different result with a different judge in the same court.
The Intercept’s DMCA case survived OpenAI’s motions to dismiss.
Here the plaintiff alleged that the scraped copyrighted data didn’t include the identifying CMI.
The plaintiff gave three examples of their scraped data showing up in the response to a prompt without identifying CMI.
OpenAI made the same arguments in both cases, in the same court, with two different results.
Somehow that about sums up where the law stands on AI at the moment.
Thanks for the analyses to the folks at Skadden Arps, via Lexology.

