Day 28: Does AI Violate Copyrights?
A couple of years ago, I published a work for hire with a leading legal publisher. Today, I received a letter from that publisher informing me that any content prepared for and owned by them could not be uploaded to a large language model (LLM) like ChatGPT, or to any other form of generative AI. Using this content in an LLM, they stated, would violate their copyright.
Who Owns Copyrights to Legal Publications?
A work for hire is just what it sounds like: the publishing company approaches you and offers you a contract (and fee) to write something they think their audiences would be interested in. Under this arrangement, the publisher almost always owns the copyright.
Legal casebooks and many practice-specific legal publications are produced this way. In contrast, with standard law review articles and monographs, the copyright is usually held by the author.
In either case, there’s a copyright owner who may or may not consent to their content being uploaded to AI. Does it matter?
Is Generative AI Protected as Fair Use?
Actually, the legal case for fair use seems pretty unsettled. As The Verge reported last year, experts disagree about the intellectual property ramifications of LLMs, and no one knows where the lines will be drawn in court.
The easier question appears to be whether you can copyright the outputs of an AI model. The answer is no, if the output was produced solely by a machine. There’s a grey area, however, when there was substantial human input into the final product.
When it comes to IP questions surrounding inputs, like the rights involved in the letter I received today, there’s a distinction to draw between training the model and generating content using that data. According to Daniel Gervais, a professor at Vanderbilt Law, training the model on copyright-protected content might be protected under the fair use doctrine, while generating output from that content might not be.
Under this theory, use of my publisher’s copyrighted material to train an LLM would not be a violation. (There may be a separate issue as to whether I, as the non-holder of the copyright, would violate any laws or duties if I affirmatively supplied the content to an LLM without the publisher’s consent, but if the training of the model itself were fair use, it’s hard to see how my facilitation of it would be a violation.)
A Doctrine to Be Named Later
I’m not an intellectual property specialist, so I’ll leave the divining on these questions to the experts. (And, as we near MLB’s trade deadline next week, I’ll borrow a phrase as a placeholder.)
But it’s worth noting that battle lines are being drawn on this issue. And if you use generative AI, particularly in your business, it’s an issue you may want to track carefully.