Whenever the two terms "copyright law" and "AI" are mentioned together, the first issue that comes to mind is the challenge of harmonizing the former with the latter. Copyright law applies to the works of an author, but nowhere does it recognize AI as an author. Whether AI can be given the cloak of authorship, and its output the protection of copyright law, was therefore the very first question for consideration.
With the advancement of AI, challenges have been identified that call for a significant overhaul of current copyright law and of the laws governing AI. While the question of harmonizing AI-generated work with copyright law is still being addressed, a new challenge has emerged that demands attention.
The Legal Battle
As AI has advanced, it has used copyrighted content to train itself without the authors' permission. ANI (Asian News International), a prominent news agency that supplies multimedia news from across India to various outlets, discovered that ChatGPT, a prominent chatbot built on a Large Language Model and capable of generating human-like responses, was drawing on ANI's news articles without authorization. This led ANI to file a copyright infringement suit before the Delhi High Court against OpenAI, a leading AI research company.
The case, which is complex and multifaceted, revolves around the unauthorized use of ANI's content by OpenAI's ChatGPT. ANI held its ground, asserting that OpenAI should have obtained permission or respected its licensing agreement. OpenAI, as the defendant, challenged the jurisdiction of the Indian courts and argued that ChatGPT is trained on publicly available data, so no copyrighted content is used.
A parallel case worth mentioning is The New York Times' lawsuit against OpenAI, filed in 2023, concerning data scraping and unauthorized use of content. While that case proceeds in the U.S., the ANI case offers an opportunity to observe how the Indian legal system responds to these emerging concerns. Its response could shape the future of copyright law with respect to artificial intelligence and lead to new regulations and practices.
Investigation Bottlenecks
What are LLMs? Large Language Models are a type of foundational AI model trained on a huge dataset, which enables them to comprehend and produce human-like language and other content formats across diverse tasks. How huge? LLMs can study approximately 3.10 billion pages of data at a time, and this number is bound to grow as the technology advances. With such a large volume of data to learn from, it is next to impossible for anyone to know exactly what a model like ChatGPT is learning from the internet, let alone to identify the copyrighted content within it.
Furthermore, after studying all this data and learning much as a human would, LLMs tend to rephrase what they have studied when answering a prompt. This rephrasing, akin to a human explaining something in their own words, makes it a complex and resource-intensive task to trace a response back to its sources, especially when copyrighted content is involved. This complexity adds another layer of difficulty to the legal battle between AI and copyright law.
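To make the tracing problem concrete, here is a minimal sketch in Python of one naive way to check whether a generated answer copies from a source article: counting shared three-word sequences. The function names, the three-word window, and the sample sentences are all assumptions invented for illustration; this is not how OpenAI, ANI, or any court analyzes model output.

```python
import re

def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(source, output, n=3):
    """Fraction of the output's n-grams that also occur in the source."""
    out = word_ngrams(output, n)
    if not out:
        return 0.0  # output too short to form any n-gram
    return len(out & word_ngrams(source, n)) / len(out)

# Hypothetical texts, made up for this example only.
article = ("The court issued notice to the company and listed "
           "the matter for hearing in January.")
verbatim = article  # a response that copies the article word for word
paraphrase = ("Judges notified the firm, scheduling a January hearing "
              "on the dispute.")  # same facts, different wording

print(overlap_ratio(article, verbatim))    # every n-gram is shared
print(overlap_ratio(article, paraphrase))  # no three-word sequence is shared
```

In this sketch the word-for-word copy scores 1.0 while the paraphrase scores 0.0, even though both convey the same facts. A simple textual comparison of this kind can flag verbatim reproduction but says nothing about rephrased content, which is one reason tracing a response back to a copyrighted source is so resource-intensive.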
____________________________________________________________________
~ Purav Garg
Research Associate, World Cyber Security Forum
(B.Com LLB (H), UILS, Chandigarh)