Artificial intelligence (AI) has ushered in transformative capabilities, but it has also raised significant legal challenges, particularly around copyright. AI models, including powerful language models like GPT-4, require vast datasets for training. These datasets often contain copyrighted materials, which has led to a series of lawsuits against major tech companies such as OpenAI, Meta, Microsoft, and Google, as well as against smaller AI startups.
Several writers, including comedian Sarah Silverman and novelist Michael Chabon, have filed lawsuits against tech giants, alleging copyright infringement. These lawsuits question the legality of training AI models on copyrighted books and other materials. While the outcomes could have far-reaching consequences for how AI models are built and trained, the cases may not reach court for several years.
One case involving copyright and AI that is already headed to trial is Thomson Reuters' 2020 lawsuit against Ross Intelligence. The media company alleges that Ross Intelligence attempted to scrape copyrighted legal summaries from Westlaw, its legal research service. While the case does not involve language models directly, it raises important questions about copyright and fair use, with a trial tentatively scheduled for May 2024.
A key legal question in these cases is whether training AI models on copyrighted materials qualifies as fair use, a doctrine that allows limited use of copyrighted works without permission in order to promote creative expression. Fair use favors "transformative" works, those that differ significantly in purpose and character from the original material. When an AI model is seen as competing directly with the source material, however, as in the Thomson Reuters v. Ross Intelligence case, the fair use argument weakens, tilting the balance toward the copyright holders.
The worst-case scenario for the AI companies in these disputes is being held liable for copyright infringement and forced to halt model development entirely. Copyrighted content is difficult to filter out of a model once it has been trained, so the practical remedy might require retraining from scratch. Damages could also be astronomical: in the United States, statutory damages for willful infringement can reach $150,000 per work, which across millions of works could be enough to force companies to shut down their AI projects.
Microsoft President Brad Smith has pledged that the company will assume legal responsibility for customers who face copyright claims over their use of its AI tools. Rather than risk adverse court rulings, companies may choose to settle. Microsoft, like other tech giants, is also carefully training its AI models not to reproduce copyrighted content.
The lawsuits against AI companies raise complex questions about copyright, fair use, and how AI models may be built in the future. The legal battles may take years to resolve, but they underscore the need for AI companies to confront these copyright concerns now, since the outcomes will shape the relationship between AI development and creative works.