AI lawsuits worth watching: A curated guide

Image by Mohamed Hassan from Pixabay

This Transparency Coalition guide is co-published with Tech Policy Press, a nonprofit media venture dedicated to promoting new ideas, debate and discussion at the intersection of technology and democracy.

As policymakers work to create safeguards around artificial intelligence, dozens of AI-related lawsuits are simultaneously working their way through state and federal courts. It may be months or years before those legal actions are fully heard, but early procedural rulings are likely to influence the laws that emerge around training data, copyright, data privacy, and other issues.

The hailstorm of AI-related lawsuits over the past nine months can make the litigation space feel chaotic and confusing. In fact, the lawsuits can be roughly sorted into two buckets: copyright infringement and harmful AI-driven outcomes.

Illustration by Sasha Beck

Today most of the fury and passion is coming out of the copyright fight. Both sides frame this as an existential struggle. Authors, artists, and institutions like The New York Times argue that their very livelihoods are under assault by powerful billion-dollar AI models like OpenAI’s ChatGPT chatbot or Stability’s Stable Diffusion image generator. OpenAI officials told a UK House of Lords committee earlier this year that “it would be impossible to train today’s leading AI models without using copyrighted materials.” Notably, OpenAI did not say it would be impossible to train AI models without paying to license copyrighted materials—which OpenAI is now doing via data deals with Reddit, The Financial Times, Vox Media, and others.

Lawsuits claiming harm from AI-driven outcomes are largely single-individual cases. Most involve wrongful arrest driven by AI facial recognition systems, digital redlining in housing or employment, or defamation via AI hallucination. Although they receive far less media attention, these cases may establish influential precedents on questions of AI liability, privacy harms, and algorithmic discrimination.

Credit where it’s due: The team at The George Washington University Law School’s Ethical Technology Initiative maintains an AI Litigation Database that includes select past cases that may influence current rulings. The Free Law Project also maintains an updated roster of AI cases, and its CourtListener initiative provides immediate access to case documents. We’ve drawn upon their work to create this curated list.

Quick links to lawsuit subjects
Copyright Infringement: Journalism | Books | Images | Music | Personal Data
Harmful Outcomes: Wrongful Arrest | Digital Redlining (Housing) | Digital Redlining (Employment) | Libel
AI Transparency

Copyright infringement lawsuits

A handful of actions are emerging as the big-fish lawsuits. The New York Times and Alden newspapers (NY Daily News, Chicago Tribune, et al), separately, are pursuing actions against OpenAI and Microsoft for the unauthorized use of their text as training data. The Authors Guild is suing OpenAI for the unauthorized use of copyright-protected book text as training data. Getty Images is suing Stability AI for unauthorized use of Getty-owned images. Universal and other music publishers are suing Anthropic for unlawful use of their intellectual property as training data.

The two biggest federal copyright lawsuits, filed by The New York Times and the Authors Guild, are being overseen by Judge Sidney H. Stein of the U.S. District Court for the Southern District of New York.

As with most cases, jurisdiction and judge matter. And there’s a bit of an East Coast-West Coast rap battle emerging here. A number of AI-related copyright cases have been filed in the Northern District of California, home of Silicon Valley and most of the nation’s largest tech and AI companies. Most, however, have been filed in the Southern District of New York (SDNY), home of the nation’s publishing industry and some of the world’s most powerful media companies. So far the Northern California judges have been more favorable to the AI developers in their rulings than the New York judges.

SDNY’s Judge Stein has quickly emerged as a powerful arbiter, consolidating a number of disparate cases into the Authors Guild lawsuit while keeping open the possibility of doing the same with the Times and various journalism-based cases. Stein is no newcomer to copyright law. Earlier this year he oversaw the settlement of a much-watched art world case brought against the artist Richard Prince by two photographers whose images he had incorporated into his work. Judge Stein steered the Prince case into a settlement (Prince paid the original photographers for use of their work), which is worth keeping in mind. With so much at stake it may be in everyone’s interest to avoid rolling the dice on a decision—sure to be appealed all the way up—that devastates one side while setting precedent for decades to come.

Image by Mahesh Patel from Pixabay

Copyright infringement (journalism)

The New York Times v. Microsoft and OpenAI

Filed Dec. 27, 2023. Jurisdiction: Federal, Southern District of New York (SDNY). In the most widely watched case to date, The New York Times accuses OpenAI and Microsoft of unlawfully using The Times’ journalism as training data to create AI products that compete with The Times itself and threaten the company’s ability to provide that journalism. Per the complaint:

Defendants’ generative artificial intelligence (“GenAI”) tools rely on large-language models (“LLMs”) that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more. While Defendants engaged in widescale copying from many sources, they gave Times content particular emphasis when building their LLMs—revealing a preference that recognizes the value of those works. Through Microsoft’s Bing Chat (recently rebranded as “Copilot”) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment. 

In response, OpenAI claims that its input of Times articles constitutes fair use of that material. “Our position can be summed up in these four points,” the company says on its website. To quote OpenAI:

1. We collaborate with news organizations and are creating new opportunities.

2. Training is fair use, but we provide an opt-out because it’s the right thing to do.

3. ‘Regurgitation’ is a rare bug that we are working to drive to zero.

4. The New York Times is not telling the full story.

Training is fair use: Here lies the crux of the issue, which we expect eminent jurists—up to and including the Supreme Court—to struggle with over the coming months and years. OpenAI wants LLMs like ChatGPT to be considered learning entities, like high school students who read Times articles as source material prior to writing an essay in their own words. The Times rebuts that idea by offering (in the lawsuit’s Exhibit J) hundreds of examples of ChatGPT “regurgitating” full New York Times articles word-for-word. The Times contends that ChatGPT should be considered the world’s most powerful plagiarist and intellectual property thief.

Other infringement (journalism) cases worth watching

Daily News v. Microsoft and OpenAI

Filed April 30, 2024. Jurisdiction: Federal, Southern District of New York (SDNY). This case might more properly be called Alden Media, as the plaintiffs include media properties owned by the giant investment firm Alden Global Capital: The New York Daily News, The Chicago Tribune, The Denver Post, The Orlando Sentinel, and The San Jose Mercury News. This case has also been assigned to Judge Stein, who has so far shown no interest in consolidating it with the Times action.

The Intercept v. OpenAI and Microsoft

Filed Feb. 28, 2024. Jurisdiction: Federal, SDNY.

The nonprofit investigative news outlet makes copyright infringement claims similar to those in the New York Times lawsuit.

Raw Story Media v. OpenAI and Microsoft

Filed Feb. 28, 2024. Jurisdiction: Federal, SDNY. The left-leaning alternative media company files its own copyright claims against OpenAI and its big backer.

Image by Lubos Houska from Pixabay

Copyright infringement (books)

Authors Guild v. OpenAI and Microsoft

Filed Sept. 19, 2023. Jurisdiction: Federal, SDNY. In one of the first lawsuits charging OpenAI and Microsoft with copyright infringement, the lead plaintiffs include some of America’s most famous authors: Jodi Picoult, George R.R. Martin, John Grisham, Christina Baker Kline, Jonathan Franzen, and Roxana Robinson, among others.

This New York-based class action copyright infringement suit claims unlawful use of fiction to train LLMs. As the leading book-based AI lawsuit, it was assigned to SDNY’s Judge Sidney Stein, who consolidated it with Sancton v. OpenAI, the nonfiction version of the same claim, as well as Basbanes v. OpenAI.

Tremblay v. OpenAI

Filed June 28, 2023. Jurisdiction: Federal, Northern District of California. Tremblay is shaping up as the West Coast version of the big-tent Authors Guild case. Same cause of action (copyright infringement claims by authors against OpenAI), similar bold-face names involved. Originally filed by writers Paul Tremblay and Mona Awad in June 2023, the case was assigned to U.S. District Court Judge Araceli Martinez-Olguin.

Since then the San Francisco-based court has consolidated Tremblay with Chabon v. OpenAI (in Oct. 2023) and Silverman v. OpenAI (in Feb. 2024). Significantly, Judge Martinez-Olguin used the Silverman suit (led by author and actress Sarah Silverman) to narrow the scope of the issue when she rejected the claim that every answer generated by OpenAI’s ChatGPT system constitutes an infringing work. The judge didn’t find enough similarity between the original works and the ChatGPT output to justify the claim. She did allow the training data copyright claim to proceed, however, and folded Silverman into Tremblay. So now the entire case essentially turns on that one issue.

Earlier this year the San Francisco-based Tremblay plaintiffs attempted to stay the New York-based Authors Guild case under the first-to-file rule. That civil procedure principle holds that when similar disputes are filed in different courts, the court that first acquired jurisdiction retains the case. Tremblay’s authors were basically saying: We should be the case of record because we filed before the Authors Guild. In March, Judge Martinez-Olguin rejected that request. Both cases continue to move forward.

Other infringement (books) cases to watch

Kadrey v. Meta

Filed July 7, 2023. Jurisdiction: Federal, Northern District of California. This action against Meta, parent company of Facebook and Instagram, mirrors the authors’ lawsuits against OpenAI for copyright infringement based on the use of their works as training data for Meta’s AI model LLaMA. Authors Richard Kadrey and Sarah Silverman are the lead plaintiffs. Judge Vince Chhabria has consolidated two similar cases, Chabon v. Meta, and Huckabee v. Meta, into Kadrey.

Chabon v. Meta

Filed Sept. 12, 2023. Jurisdiction: Federal, Northern District of California. Similar copyright claims against Meta by author Michael Chabon and others. Folded into Kadrey v. Meta in Dec. 2023.

Huckabee v. Meta

Filed Oct. 17, 2023. Jurisdiction: Federal, Northern District of California. Similar copyright claims against Meta by author and former Arkansas Gov. Mike Huckabee. Originally filed in the Southern District of New York, the case was moved to California and consolidated with Kadrey in Dec. 2023.

Copyright infringement (images)

Getty Images v. Stability AI

Filed Feb. 3, 2023. Jurisdiction: Federal, District of Delaware.

This is the big image copyright case. Getty Images, one of the world’s leading stock image providers, sues Stability AI, developer of text-to-image generative AI model Stable Diffusion. Getty alleges that its copyrighted images have been used without authorization to train Stability’s AI model, and that its trademark has also been infringed through the appearance of the Getty Images watermark in images produced by Stable Diffusion as output.

Getty’s claims are similar to those in the New York Times, Authors Guild, and Concord Music cases: Unauthorized use of copyrighted images as training data—so brazen, Getty adds, that the company’s watermark has appeared in Stable Diffusion outputs.

Stability’s legal team has attempted to move the case’s federal jurisdiction from Delaware to the more tech-friendly arena of Northern California. That hasn’t happened, but the motion has succeeded in delaying the case. The parties have spent more than a year arguing about where to base the case, with no end in sight.

Other infringement (images) cases to watch 

Andersen v. Stability AI, Midjourney, and DeviantArt

Filed Jan. 13, 2023. Jurisdiction: Federal, Northern District of California. Three visual artists file a lawsuit on behalf of a class of artists against Stability, Midjourney, and DeviantArt, alleging unauthorized use of their work as training data.

Zhang v. Google

Filed Apr. 26, 2024. Jurisdiction: Federal, Northern District of California. Four visual artists file against Google and parent company Alphabet claiming the unauthorized use of their work as training data in the development of its Imagen text-to-image AI model.

Image by bandsintown from Pixabay

Copyright infringement (music) 

Concord Music Group v. Anthropic

Filed Oct. 18, 2023. Jurisdiction: Federal, Middle District of Tennessee.

This is the one big music case. More than a dozen of the nation’s largest music publishers (Concord, Capitol, Universal Music, Geffen, Polygram, et al) come together to file suit against the generative AI developer Anthropic. At issue, per the complaint:

In the process of building and operating AI models, Anthropic unlawfully copies and disseminates vast amounts of copyrighted works—including the lyrics to myriad musical compositions owned or controlled by Publishers. Publishers embrace innovation and recognize the great promise of AI when used ethically and responsibly. But Anthropic violates these principles on a systematic and widespread basis. Anthropic must abide by well-established copyright laws, just as countless other technology companies regularly do.

Anthropic counters:

Anthropic’s alleged intermediate use of Plaintiffs’ song lyrics to train its generative AI models is transformative, limited in scope, unrelated to the creative core of the works, and neither has nor will adversely impact any existing or potential legitimate market for Plaintiffs’ copyrighted works. 

Interestingly, Anthropic claims that “there is not currently, nor is there likely to be, a workable market for licenses to use copyrighted works to train general text-generative AI models.” This despite the well-publicized fact that OpenAI, an Anthropic competitor, has signed licensing deals to do just that, effectively establishing a market for licenses to use copyrighted works to train AI models.

Anthropic’s legal team does not like the idea of trying this case in Tennessee, home of the nation’s country music industry. They’ve attempted to move it to the more tech-friendly district of Northern California, to no avail. Chief U.S. District Court Judge Waverly Crenshaw, Jr., has the case and he’s put it on the calendar. A jury trial is scheduled to open on Nov. 18, 2025 in Nashville if the parties can’t find resolution prior to that date.

Other infringement (music) cases to watch 

Recording Industry Association of America (RIAA) v. Suno and Udio

Filed June 24, 2024. Jurisdiction: Federal, Massachusetts and SDNY, respectively. The RIAA, on behalf of the big music companies (Sony, UMG, Warner), alleges mass copyright infringement against two generative AI companies.

The Massachusetts jurisdiction is based on Suno’s residency in Cambridge, Massachusetts. Udio is based in New York City, hence the SDNY filing.

Update: In August 2024, Rolling Stone reported that Suno had replied to the RIAA’s complaint, arguing that its use of copyrighted music falls under the legal definition of fair use.

Image by axbenabdellah from Pixabay

Copyright infringement and privacy harm (personal data)

A.T. v. OpenAI and Microsoft

and

L. v. Alphabet

Filed July 2023 and Sept. 2023. Jurisdiction: Federal, Northern District of California.

Dismissed May 2024 and June 2024.

These two cases—class action lawsuits alleging the unlawful collection of personal data from private individuals by Google, OpenAI, and Microsoft, and the unauthorized use of that data to train AI models—are important precisely because of their dismissal. Essentially, they establish the limits of the Northern California court’s patience with regard to vague and hyperbolic claims against AI developers.

Less than a year after filing, U.S. District Court Judge Vince Chhabria sent the plaintiffs packing with a sharp hand-slap. Their 200-page complaint “contains largely irrelevant, distracting, or redundant information,” he wrote, and fails to present a clear reason why the plaintiffs are entitled to relief.  

Judge Chhabria wrote:

The development of AI technology may well give rise to grave concerns for society, but the plaintiffs need to understand that they are in a court of law, not a town hall meeting.  Because the Court has no way of telling whether the plaintiffs could adequately state a claim once all the mud is scraped off the walls of the complaint, dismissal is with leave to amend.

In other words: Don’t come to my court with a Chicken Little case claiming that AI is making the sky fall.

A few days after Chhabria dismissed the case against OpenAI, his colleague Judge Araceli Martinez-Olguin dismissed the case against Alphabet based on the same principles.

The message they sent was clear: AI lawsuits tried in Northern California will be decided on the specific merits of the claims based on established law—and not on potential risks or generalized fears raised by AI systems themselves.  

Harmful outcome lawsuits

Harmful outcome lawsuits largely turn on harm inflicted on an individual person as a result of AI-driven decision making. They range from wrongful arrest to libel to digital redlining in housing and employment. In most cases the lawsuits are based on well-established laws like the Fair Housing Act and the Civil Rights Act of 1964.

Image: Pixabay

Harmful outcome (wrongful arrest)

Oliver v. Detroit

Filed Oct. 6, 2020. Federal, Eastern District of Michigan.

Williams v. Detroit

Filed Apr. 13, 2021. Federal, Eastern District of Michigan.

Woodruff v. Detroit

Filed Aug. 3, 2023. Federal, Eastern District of Michigan.

All three of these cases allege wrongful arrest by the Detroit Police based on information obtained from AI-driven facial recognition systems.

Parks v. McCormac

Filed Mar. 3, 2021. Federal, District of New Jersey.

Parks alleges wrongful arrest based on the use of AI-driven facial recognition systems in the town of Woodbridge, New Jersey.

Reid v. Bartholomew

Filed Sept. 8, 2023. Federal, Northern District of Georgia.

Plaintiff Randal Reid alleges wrongful arrest and imprisonment and violation of his Fourth Amendment rights. Reid was arrested and held without bond for six days after being misidentified by an AI-driven facial recognition system used by the Jefferson Parish (Louisiana) Sheriff’s Department.

Murphy v. EssilorLuxottica USA and Macy’s

Filed Jan. 18, 2024. District Court of Harris County, Texas.  

A Texas man sues Macy’s and the parent company of Sunglass Hut over their alleged use of an AI-driven facial recognition system that misidentified him as an armed robber, leading to his wrongful arrest. Murphy alleges that while in jail awaiting trial, he was assaulted and sustained lifelong injuries.

Harmful outcome (digital redlining, housing)

United States v. Meta

Filed June 21, 2022. Federal, SDNY.

Settled June 27, 2023.

In this first case challenging algorithmic bias under the Fair Housing Act, federal prosecutors alleged that Meta’s housing ad system (Facebook and Instagram) discriminated against consumers by excluding some users from receiving certain housing ads based on their FHA-protected characteristics. A settlement, reached one year later, required Meta to stop using its noncompliant advertising tool, and to create a new system for housing ads to address the bias caused by its personalization algorithms. Meta will be subject to oversight and compliance reviews (conducted by a third party) through June 27, 2026.

Williams v. Wells Fargo Bank

Filed Feb. 17, 2022. Federal, Northern District of California.

Six separate class-action lawsuits against Wells Fargo Bank have been consolidated under the umbrella of Williams v. Wells Fargo. All allege algorithm-driven discrimination in residential mortgage and refinance practices, in violation of the Fair Housing Act and the Equal Credit Opportunity Act. Plaintiffs claim that Wells Fargo uses its “pioneering automated underwriting” system, known as CORE, without sufficient human supervision or involvement, and that CORE’s algorithm and machine learning are infected with racial bias.

Open Communities v. Harbor Group Management

Filed Sept. 25, 2023. Federal, Northern District of Illinois.

Resolved with consent decree Jan. 23, 2024.

An investigation by the fair housing advocacy group Open Communities found that Harbor Group’s use of AI tools at its apartment complexes across the country consistently led to discriminatory outcomes. The lawsuit alleges that Harbor Group “intentionally employed PERQ Artificial Intelligence automated systems to communicate a blanket ‘no Housing Choice Voucher/No Section 8 Policy’ policy to reject Internet rental applications from individuals who receive housing assistance payments.” Result: Parties entered into a two-year consent decree wherein Harbor Group will work with Open Communities to ensure Harbor develops and applies fair housing procedures in its property operations, including regular reviews of AI-generated responses to prospective tenants.

Image by Gerd Altmann from Pixabay

Harmful outcome (digital redlining, employment)

Mobley v. Workday

Filed Feb. 21, 2023. Federal, Northern District of California.

Plaintiff Mobley, an African-American man over the age of 40, alleges violations of the Civil Rights Act of 1964, the Age Discrimination in Employment Act of 1967, and the ADA Amendments Act of 2008, claiming that Workday’s AI-driven employment systems and screening tools rely on algorithms and inputs that discriminate against job applicants based on race, age, and/or disability.

An especially compelling question in Mobley involves liability. If an AI-driven application screening program discriminates against a protected class (race, age, disability), who is liable for that discriminatory act—the AI system developer/provider, the client company using the system in its hiring process, or both? This is a pressing and contentious question in many AI-related bills now making their way through state legislatures.

Harmful outcome (libel)

Walters v. OpenAI

Filed July 14, 2023. Federal, Northern District of Georgia.

This first-of-its-kind libel suit against a generative AI product may test the legal liability of false outputs known as hallucinations. It has bounced back and forth between state and federal court. As of early 2024, it resides in Gwinnett County Superior Court in Georgia.

Plaintiff Mark Walters, a Georgia resident, is the host of Armed America Radio, a Second Amendment advocacy program. He is not now nor ever has been associated with the Second Amendment Foundation, a nonprofit based in Washington State.

In May 2023, the Second Amendment Foundation (SAF) filed a federal lawsuit against the Washington State Attorney General for what SAF alleges is illegal harassment in the form of an ongoing investigation meant to curb SAF’s free speech rights. In the course of reporting on that lawsuit, a third-party journalist used OpenAI’s chatbot, ChatGPT, to summarize the accusations contained in the 30-page legal complaint. The chatbot responded with a fraudulent hallucination that posited the lawsuit as a complaint against Walters, “who is accused of defrauding and embezzling funds from the SAF.” ChatGPT went into great detail about Walters’ position as SAF’s Treasurer and CFO, the alleged misappropriated funds, “breach of fiduciary duty and fraud,” and the attempted removal of Walters from SAF’s board of directors. None of this is true. Walters is not involved with SAF or the lawsuit in any way.

The journalist brought this to the attention of Walters, who sued OpenAI for libel. Per his complaint:

ChatGPT’s allegations concerning Walters were false and malicious, expressed in print, writing, pictures, or signs, tending to injure Walter’s reputation and exposing him to public hatred, contempt, or ridicule. By sending the allegations to Riehl, OAI published libelous matter regarding Walters.

OpenAI has responded by seeking shelter within the strict conditions set by U.S. libel law. The company argues that ChatGPT’s outputs are “probabilistic,” which can have any number of meanings, one of which is “not necessarily true.” Further, OpenAI holds that ChatGPT’s output does not constitute publication, that Walters is a public figure by virtue of his radio show, and that ChatGPT’s hallucinations are constructed without “actual malice.” In early 2024, the Georgia judge rejected OpenAI’s motion to dismiss, and the case continues.

Freedom of Information, AI transparency

Rayner v. New York State Department of Corrections

Filed Nov. 15, 2022. New York State Supreme Court, Albany County. Decided Sept. 14, 2023.

This case has potential implications for the issue of transparency with regard to the datasets used to train AI models.

In 1998, the New York State Department of Corrections Parole Board began using a digital risk assessment tool known as COMPAS, developed and licensed by the company Equivant. The COMPAS algorithm generates a risk level—low, medium, or high—by analyzing a reentry questionnaire. In 2022, Fordham Law School Professor Martha Rayner filed a Freedom of Information request for insight into the workings of COMPAS. That request was denied. On appeal, Rayner requested information about the formulas COMPAS used to calculate risk, the cutoff points for each level of risk, the data group against which the Parole Board scored each applicant, and all validation studies done on the NYCOMPAS program.

Rayner’s request for information about COMPAS—to learn the basis on which parole applicants were being judged as high- or low-risk offenders—bears more than a passing resemblance to the call for transparency around the training data used to create today’s most powerful AI models like ChatGPT.

How’d it go for Rayner? Not well. The Department of Corrections said that the records responsive to her request “were determined to be exempt from disclosure as trade secrets or material that would cause substantial competitive harm to Equivant if disclosed.” In Sept. 2023 the Albany County Supreme Court agreed, and Rayner’s petition was dismissed. The public interest in ensuring the accuracy, fairness, and transparency of the COMPAS system went unmentioned in the Court’s decision.
