China's first AI voice infringement case has been decided: how can generative artificial intelligence avoid infringement?

April 25, 2024

On April 23, the Beijing Internet Court handed down its first-instance judgment in the "first AI voice infringement case in China" (hereinafter the "AI voice case"), holding that, provided the voice is identifiable, the protection of a natural person's voice rights and interests extends to AI-generated voices. AI-generated content typically relies on existing materials for machine learning and training, which are then transformed into AI output in a particular form. In this case, the AI voice generated by text-to-speech software (hereinafter the "product in question") is referred to as a "sound recording". The five defendants, namely the designer and developer, the voice material provider, the seller, the buyer, and the user of the product in question, represent the main types of participants in the generative AI industry. How to avoid infringement risks while better promoting the industry's development deserves careful thought.


1. Basic Facts of the AI Voice Case [1]


The plaintiff, Yin, is a voice actor. Alerted by a friend, the plaintiff discovered that works made by others using the plaintiff's voice were circulating widely on several well-known apps. Voice screening and tracing showed that the voice in those works came from a text-to-speech product operated by Defendant 1, an intelligent technology company in Beijing, in which users convert text to speech by entering text and adjusting parameters. The plaintiff had earlier been commissioned by Defendant 2, a cultural media company in Beijing, to make sound recordings, and Defendant 2 owns the copyright in those recordings. Defendant 2 supplied the audio of the recordings made by the plaintiff to Defendant 3, a software company, and permitted Defendant 3 to use, copy, and modify the data for commercial or non-commercial purposes in its products and services. Defendant 3 used only the plaintiff's sound recordings as material for AI processing, generated the text-to-speech product at issue, and offered it for sale on a cloud service platform operated by Defendant 4, a network technology company in Shanghai. Defendant 1 signed an online service sales contract with Defendant 5, a technology development company in Beijing, and Defendant 5 placed an order with Defendant 3 that included the text-to-speech product at issue. Defendant 1 then called the product directly through an application programming interface to generate speech for use on its platform, without any further technical processing.


The plaintiff claimed that the defendants' conduct seriously infringed the plaintiff's voice rights and interests, that Defendant 1 (the intelligent technology company in Beijing) and Defendant 3 (the software company) should immediately cease the infringement and apologize, and that all five defendants should compensate the plaintiff for economic loss and mental distress.


2. A Brief Analysis of the AI Voice Case


(1) Scope of protection for the voice rights and interests of natural persons


The plaintiff's action in this case is a personality rights infringement dispute. It is based on Article 1023 of the Civil Code, which provides that "the protection of a natural person's voice shall be governed by reference to the relevant provisions on the protection of portrait rights", so that the voice is protected as a special personality interest. Devoting a separate book of the code to personality rights is a major feature of China's Civil Code. The personality rights it protects are divided into general personality rights and specific personality rights. The former are rights grounded in personal freedom and dignity, and are highly general and encompassing in character; the latter are rights such as the rights to life, body, health, name, portrait, reputation, honor, privacy, and marital autonomy. By providing that a natural person's voice is protected by reference to the rules on portrait rights, the law implies two things. First, a natural person holds only a legally protected interest in his or her voice; that protection has not risen to the level of a "right" and does not carry the absolute, erga omnes effect of the specific personality rights enumerated by law. Second, in protecting a natural person's voice, reference may be made to the elements of portrait rights infringement.


A portrait is an image that depicts all or part of a natural person's facial features, bodily features, limbs, or other identifiable characteristics, produced by photography, sculpture, video recording, painting, electronic or digital technology, or other means, fixed in a physical or virtual medium, and perceived primarily by sight. The most basic function of a portrait is identification. The portrait right is the exclusive control the law grants a natural person over the reproduction of his or her own image. It has two attributes: an active attribute of control, such as the right to use one's portrait, and a passive, defensive attribute of inviolability, such as the right to maintain the integrity of one's portrait. It follows that the essential similarity between voice and portrait lies in their identifiability, and that the protection of voice rights and interests should likewise cover two aspects: active control and use of the voice, and passive defense against its misuse.


(2) Whether the defendants' conduct constitutes infringement


As described above, the allegedly infringing conduct in this case is the AI-based use and sale of the plaintiff's voice materials through text-to-speech software whose output is "substantially similar" to the plaintiff's voice. Whether that conduct is infringing turns on whether the AI-based use of the plaintiff's voice materials is itself infringing. First, the plaintiff's voice must be identifiable, that is, it must point to a specific natural person, in order to qualify for protection. Second, it must be determined whether the defendants used the plaintiff's voice for commercial purposes without permission. Finally, it must be examined whether the plaintiff's authorization to use the sound recordings extends to AI-based use.


After trial, the Beijing Internet Court held that a natural person's voice is distinctive, unique, and stable, and is distinguished by its voiceprint, timbre, and frequency. It can lead ordinary people to form thoughts or feelings associated with that natural person, and can convey the person's conduct and identity to the outside world. The identifiability of a natural person's voice means that others, on the basis of repeated or long-term listening, can recognize a specific natural person from the characteristics of the voice. An AI-synthesized voice can be found identifiable if the general public, or the public in the relevant field, can associate it with the natural person on the basis of its timbre, intonation, and pronunciation style. The plaintiff's voice is therefore entitled to legal protection.


Defendant 2, the cultural media company in Beijing, holds copyright and other rights in the sound recordings, but those rights do not include the right to authorize others to make AI-based use of the plaintiff's voice. The data agreement under which Defendant 2 authorized Defendant 3, the software company, to make AI-based use of the plaintiff's voice was concluded without the plaintiff's informed consent and therefore had no lawful basis. Accordingly, the defense raised by Defendants 2 and 3 that they had obtained lawful authorization from the plaintiff cannot stand. In sum, the defendants' conduct constitutes infringement. Defendant 2 and Defendant 3 made AI-based use of the plaintiff's voice without the plaintiff's permission, which infringed the plaintiff's voice rights and interests and caused damage to them, and they must bear the corresponding legal liability. Defendant 1 (the intelligent technology company in Beijing), Defendant 4 (the network technology company in Shanghai), and Defendant 5 (the technology development company in Beijing) were not subjectively at fault and are not liable for damages.


3. How can generative artificial intelligence avoid infringement?


In November 2022, OpenAI released ChatGPT, a generative artificial intelligence product built on a large language model, marking a new stage in the development of generative AI technology. AI-generated output poses many challenges for the copyright system, chiefly the copyrightability of that output, the ownership of rights in the generated expression, the labeling of the sources of generated expression, and the resulting liability for copyright infringement. [2] In practice, cases such as the DreamWriter case [3], the Wolters Kluwer (威科先行) database case [4], and the "Stable Diffusion" copyright case [5] have mainly concerned the copyrightability of AI-generated output. As the first AI voice infringement case, the present case goes further and shows how industry practice should be regulated across the whole process of designing, developing, selling, operating, and using generative AI products that draw on materials subject to prior rights (or interests). As an emerging industry, generative AI previously grew in a largely unregulated fashion. With more capital invested and more efficient output, however, it will inevitably involve more parties and more interests. Behind the legal disputes lies an imbalance of interests among market participants, and more infringement disputes of this kind can be expected. For everyone already in the industry and everyone about to enter it, the AI voice case provides further clarity: through its judgments, the court sets the boundaries for applying new business models and new technologies, and consistently signals a judicial attitude that balances the protection of personality rights with steering technology toward good ends.


To avoid infringement risk, therefore, a sensible starting point is to standardize the software development process, define the scope of each authorization clearly, secure an unbroken chain of authorizations and licenses, and strengthen the platform's reasonable duty of care. As the Internet and AI continue to develop and AI technology is applied across more fields, legal disputes over generative AI will become increasingly complex and varied. Maintaining a stable social order, promoting the healthy development of the industry, and achieving good law and good governance all depend on the continued improvement of legislation and adjudication, and also call for the attention of academia, industry, and practitioners. Any improvement should ultimately strike a balance, in the digital technology era, among the rights holders of AI software materials, the copyright holders of works, the owners of the carriers of works, and the public, so that the industrial chain can develop sustainably. [6]


References and comments


[1] "The first case of AI-generated voice infringement in China", published on the official account of the Beijing Internet Court, April 23, 2024.
[2] Cao Xinming and Ma Zibin: "The Challenge and Response of Generative Artificial Intelligence to the Copyright System", China Copyright, Issue 6, 2023.
[3] See Nanshan District People's Court of Shenzhen, Civil Judgment (2019) Yue 0305 Min Chu 14010.
[4] See Beijing Internet Court, Civil Judgment (2018) Jing 0491 Min Chu 239, and Beijing Intellectual Property Court, Civil Judgment (2019) Jing 73 Min Zhong 2030.
[5] See Beijing Internet Court, Civil Judgment (2023) Jing 0491 Min Chu 11279.
[6] Sun Shan: "The Governance Logic of Overprotection of Works in the Digital Technology Era", Science and Technology Publishing, Issue 2, 2024.