New York Times Says OpenAI Erased Potential Lawsuit Evidence

Lawsuits are never exactly a lovefest, but the copyright fight between The New York Times and both OpenAI and Microsoft is getting especially contentious. This week, the Times alleged that OpenAI’s engineers inadvertently erased data the paper’s team spent more than 150 hours extracting as potential evidence.

OpenAI was able to recover much of the data, but the Times’ legal team says it’s still missing the original file names and folder structure. According to a declaration filed to the court Wednesday by Jennifer B. Maisel, a lawyer for the newspaper, this means the information “cannot be used to determine where the news plaintiffs’ copied articles” may have been incorporated into OpenAI’s artificial intelligence models.

“We disagree with the characterizations made and will file our response soon,” OpenAI spokesperson Jason Deutrom told WIRED in a statement. The New York Times declined to comment.

The Times filed its copyright lawsuit against OpenAI and Microsoft last year, alleging that the companies had illegally used its articles to train artificial intelligence tools like ChatGPT. The case is one of many ongoing legal battles between AI companies and publishers, including a similar lawsuit filed by the Daily News being handled by some of the same lawyers.

The Times’ case is currently in discovery, which means both sides are turning over requested documents and information that could become evidence. As part of the process, OpenAI was required by the court to show the Times its training data, which is a big deal—OpenAI has never publicly revealed exactly what information was used to build its AI models. To disclose it, OpenAI created what the court is calling a “sandbox” of two “virtual machines” that the Times’ lawyers could sift through. In her declaration, Maisel said that OpenAI engineers had “erased” data organized by the Times’ team on one of these machines.

According to Maisel’s filing, OpenAI acknowledged that the information had been deleted, and attempted to address the issue shortly after it was alerted to it earlier this month. But when the paper’s lawyers looked at the “restored” data, it was too disorganized, forcing them “to recreate their work from scratch using significant person-hours and computer processing time,” several other Times lawyers said in a letter filed to the judge the same day as Maisel’s declaration.

The lawyers noted that they had “no reason to believe” that the deletion was “intentional.” In emails submitted as an exhibit along with Maisel’s declaration, OpenAI counsel Tom Gorman referred to the data erasure as a “glitch.”

This is not the first dispute of its kind in the lawsuit. Over the past year, the Times and the tech companies have been fighting over which party should be responsible for sorting through the training data. In its most recent letter, the paper’s lawyers asserted again that OpenAI is in the better position to do it. “The process has not gone smoothly,” Steven Lieberman, another lawyer for the Times, wrote in a filing earlier this month, in which he claimed that “severe and repeated technical issues have made it impossible to effectively and efficiently search across OpenAI’s training datasets in order to ascertain the full scope of OpenAI’s infringement.”

The Times also recently pushed OpenAI and Microsoft to provide Slack messages, text messages, and social media conversations between a number of key OpenAI figures, including former employees like Ilya Sutskever and current executives like Brad Lightcap. Last week, The New York Times filed another letter asking the court to compel Microsoft and OpenAI to share additional materials. One exhibit featured emails that showed former OpenAI CTO Mira Murati had refused to “provide access” to her personal cell phone.

Meanwhile, Microsoft has requested that The New York Times turn over any documents related to its own use of generative AI. In a filing, it specifically cited star tech columnist Kevin Roose (referred to as “Kevin Rouse” in the court filings). Microsoft argued that information about how the Times uses AI tools could be relevant to its defense in a number of ways, including showing that they have had a positive impact on the newspaper. Roose declined to comment.

As this case and others like it wind their way through the courts, OpenAI is pursuing content licensing deals with other publishers, including The Atlantic, Axel Springer, Vox Media, and WIRED parent company Condé Nast. There’s no consensus within the media and legal worlds about how these cases will shake out. But either way, they will set a major precedent for how the AI industry can operate in the United States.
