CCC at Frankfurt Book Fair 2020 - AI Extraction: Text and Data Mining - U.S. Approach


Introduction

Roy Kaufman, Managing Director of Business Development at the Copyright Clearance Center (CCC), presented at the Frankfurt Book Fair 2020 on the relationship between copyright law and AI extraction, particularly with regard to text and data mining (TDM) in the United States. His discussion aimed to clarify the legal landscape surrounding the use of copyrighted material for AI extraction without a formal license.

Kaufman acknowledged the complexity of U.S. copyright law, especially as it pertains to AI extraction and TDM, both of which are central to modern technology. He emphasized that licensing materials through agreements or open access provides clear guidance on usage, whereas the absence of a license raises significant legal questions.

Kaufman presented AI extraction and TDM as distinct but overlapping concepts. He noted that current laws do not adequately address either, leaving courts and legislatures to respond to outdated scenarios rather than contemporary realities. He compared AI extraction software to a car: the vehicle is essential, but what matters legally is how it is used, and that use is what copyright law evaluates.

At the heart of this legal discourse is the doctrine of fair use. Kaufman pointed out a common misunderstanding: fair use cannot simply be equated with AI extraction or TDM. Instead, a thorough analysis of the "actual use" of copyrighted materials is necessary. This includes investigating the purpose of the data mining or AI extraction to determine if it falls under the fair use protections of U.S. copyright law.

To expand on the fair use concept, Kaufman referenced the Berne Convention's three-step test, which shapes copyright exceptions internationally. He particularly highlighted the second step, which requires that reproduction of a copyrighted work not conflict with its "normal exploitation." This matters because what qualifies as "normal exploitation" can change rapidly, complicating the legal landscape for text and data mining.

Kaufman discussed notable legal cases that shape this debate, focusing on the Google Books case and the HathiTrust case. While each case involved the reproduction of print materials for scholarly purposes, the decisions focused narrowly on the specific uses at issue, resulting in limited guidance on TDM or AI extraction. He highlighted one important moment in these proceedings: Google stopped scanning journals after recognizing that commercial use of that content could lead to complicated legal implications.

Lastly, Kaufman referenced the TVEyes case involving audio-visual content, which further illustrates the difficulty of determining fair use as technological capabilities evolve. The court ruled against fair use in light of the commercial nature of the use, reinforcing the need for licenses for such uses of copyrighted works.

In conclusion, Kaufman stressed that understanding how U.S. copyright law intersects with AI extraction and TDM is complex and continually evolving. As technology progresses, so too do the interpretations and applications of existing laws.

Thank you for your attention, and I hope this session has provided clarity on these critical issues.
