Determining substantial similarity of computer software without source code
The basis for finding infringement of computer software copyright is similar to that for traditional copyright infringement. Apart from proof of ownership, it rests mainly on two elements: "possibility of contact" and "substantial similarity". When the source code of both programs is available, a textual comparison of the source is relatively straightforward. In practice, however, the source code of the accused software is often difficult to obtain, and other methods are needed to establish substantial similarity. Using case (2021) Jingminchu No. 4, American Photography Company v. Tiktok Company, this article discusses how substantial similarity can be determined without source code by analyzing the software's disassembled code.
Method for determining infringement of computer software copyright
The key to determining infringement of computer software copyright lies in two elements: possibility of contact (access) and substantial similarity. Possibility of contact means that the defendant had the opportunity to come into contact with the plaintiff's software. Substantial similarity means that the defendant's software resembles the plaintiff's software in its essential expression, and that the resemblance is not the product of coincidence or independent development.
Possibility of contact is not a strictly necessary condition for finding software infringement. Where the defendant may have had contact with the right holder's software, however, the burden of proving similarity between the accused software and the plaintiff's software can be reduced. For example, in the Beijing High Court case (2021) Jingminchu No. 4, American Photography provided evidence that its software had a certain market reputation and that Xie, an employee of Tiktok Technology and its affiliated companies, had previously worked at American Photography. This evidence shows that Tiktok Technology had the opportunity to access American Photography's software, which satisfies the possibility of contact requirement.
Substantial similarity refers to similarity between the defendant's software and the plaintiff's software in their concrete expression that is not coincidental but results from plagiarism or unauthorized copying. Where source code is available, a detailed comparison can be made on the source code itself. Where it is not, other methods are needed, such as analyzing the disassembled code of the software.
Determining substantial similarity without source code
In the absence of source code, comparing disassembled code is an effective means of determining substantial similarity. Specifically, the analysis can cover four aspects: function names that are substantially the same, function names that are substantially similar, function code that is substantially the same, and function code that is substantially similar.
(1) Function names that are substantially the same
A function name is usually composed of English words or abbreviations that reflect what the function does. Function names are substantially the same when they use the same English words, or when they differ but the main English words and abbreviations are the same and the differences follow a specific correspondence. For example, if most function names in a program share the same main body while only the prefix changes from "AAA" to "BBB", the change is likely the result of batch modification.
In the (2021) Jingminchu No. 4 case, the appraisal opinion of the National Industry and Information Security Center pointed out that several function names in the Tiktok software were substantially the same as those in the American Photography software. For example, the function in the American Photography SDK software
“NvImageBufferGetSizeInBytes”
corresponds to the following function name in the Tiktok software:
“TEImageBufferGetSizeInBytes”
Likewise, the function in the American Photography SDK software
“NvCalcCanonicalBoundingRectFromImagePos”
corresponds to the following function in the Tiktok software:
“TECalcCanonicalBoundingRectFromImagePos”
In each pair, the two names are highly consistent in the choice, order, type, and number of words.
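The prefix correspondence described above can be checked mechanically. The following is a minimal sketch, not the appraisal center's actual method: it assumes the two prefixes ("Nv" and "TE") are already known, and the name "TEUnrelatedHelper" is a hypothetical entry added only to show a non-match.

```python
def strip_prefix(name, prefix):
    """Return name without the given prefix (unchanged if the prefix is absent)."""
    return name[len(prefix):] if name.startswith(prefix) else name


def match_after_prefix_swap(names_a, names_b, prefix_a, prefix_b):
    """Pair names whose bodies are identical once the vendor prefixes are removed."""
    bodies_b = {strip_prefix(n, prefix_b): n for n in names_b if n.startswith(prefix_b)}
    pairs = []
    for name in names_a:
        if name.startswith(prefix_a):
            body = strip_prefix(name, prefix_a)
            if body in bodies_b:
                pairs.append((name, bodies_b[body]))
    return pairs


plaintiff_names = [
    "NvImageBufferGetSizeInBytes",
    "NvCalcCanonicalBoundingRectFromImagePos",
]
defendant_names = [
    "TEImageBufferGetSizeInBytes",
    "TECalcCanonicalBoundingRectFromImagePos",
    "TEUnrelatedHelper",  # hypothetical name, not taken from the case
]

pairs = match_after_prefix_swap(plaintiff_names, defendant_names, "Nv", "TE")
for a, b in pairs:
    print(a, "<->", b)
```

A large share of names pairing up under a single prefix substitution is the pattern the article describes as a likely batch modification; whether it amounts to copying remains a matter for the appraiser and the court.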
(2) Function names that are substantially similar
Function names are substantially similar when the English words (and abbreviations) that make up the class names and function names are similar. For example, if individual English words differ only in details such as capitalization or part of speech while the rest of the name is substantially the same, the function names can be regarded as substantially similar. For example, the American Photography SDK software contains
“CNvBaseStreamingGraphNode::IsInputPinResovled”
and the Tiktok software contains
“TEvBaseStreamingGraphNode::isInputPinResovled”
The class names and class member functions of the two are highly similar; in the member function, the only difference is the capitalization of "Is" versus "is". In addition, both programs retain the same spelling error, "Resovled" (a misspelling of "Resolved"), which further indicates that the function names are substantially similar.
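As a rough illustration of this kind of comparison, the sketch below scores two names case-insensitively and checks whether a given misspelling survives in both. It uses Python's standard `difflib` as a stand-in similarity measure; the source does not say how the appraisal center scored names, and the list of typos to look for is an assumption supplied by the analyst.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive character-level similarity between two names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def shared_misspellings(a, b, known_typos):
    """Return the typos from known_typos that occur in both names (case-insensitive)."""
    a_low, b_low = a.lower(), b.lower()
    return [t for t in known_typos if t.lower() in a_low and t.lower() in b_low]


plaintiff_name = "CNvBaseStreamingGraphNode::IsInputPinResovled"
defendant_name = "TEvBaseStreamingGraphNode::isInputPinResovled"

score = name_similarity(plaintiff_name, defendant_name)
typos = shared_misspellings(plaintiff_name, defendant_name, ["Resovled"])

print(f"similarity: {score:.3f}")
print("shared misspellings:", typos)
```

A shared misspelling is useful evidence precisely because it has no functional reason to appear in independently written code.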
(3) Function code that is substantially the same
In the (2021) Jingminchu No. 4 case, the National Industry and Information Security Center compared the disassembled code of the Tiktok software and the American Photography SDK software and found that the assembly code of multiple functions was highly consistent in opcodes and operands, down to the same spelling errors. For example, the function
“NvGetMatchedFormatFromOpenGLInternalFormat”
corresponds to the following function name in the Tiktok software:
“TEGetMatchedFormatFromOpenGLInternalFormat”
The two have not only substantially the same names but also highly consistent code implementations.
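One simple way to test whether two disassembled functions are the same in opcodes and operands is to normalize each listing and compare fingerprints. The sketch below assumes a listing format with a leading hexadecimal address followed by a colon; the instructions themselves are invented for illustration and are not taken from either program.

```python
import hashlib
import re


def normalize(listing):
    """Strip leading addresses, collapse whitespace, and lower-case each line."""
    lines = []
    for line in listing.strip().splitlines():
        line = re.sub(r"^\s*[0-9a-fA-F]+:\s*", "", line)  # drop "401000:" style address
        line = " ".join(line.split()).lower()
        if line:
            lines.append(line)
    return lines


def fingerprint(listing):
    """SHA-256 over the normalized instruction sequence."""
    return hashlib.sha256("\n".join(normalize(listing)).encode()).hexdigest()


# Hypothetical listings: same instructions, loaded at different addresses.
func_a = """
401000: push ebp
401001: mov  ebp, esp
401003: mov  eax, [ebp+8]
401006: pop  ebp
401007: ret
"""
func_b = """
5f2000: push ebp
5f2001: mov  ebp, esp
5f2003: mov  eax, [ebp+8]
5f2006: pop  ebp
5f2007: ret
"""

same = fingerprint(func_a) == fingerprint(func_b)
print("identical after normalization:", same)
```

Real binaries also differ in relocated addresses inside operands, so a practical tool would need further normalization than this sketch performs.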
(4) Function code that is substantially similar
Function code is substantially similar when the code of corresponding functions in the two programs is similar in implementation logic and functionality even though the specific implementations differ. This can be assessed with a similarity matching algorithm combined with manual judgment, taking into account factors such as whether the differing assembly statements affect the main functionality of the function.
In the American Photography v. Tiktok case, the National Industry and Information Security Center used a function similarity matching algorithm to compare the assembly code of the Tiktok software and the American Photography SDK software. It found that the code of multiple functions was highly similar in implementation logic, although some specific opcodes and operands differed. Weighing the similarity of the functions against the differences in assembly statements, the appraisal agency concluded that these function codes were substantially similar.
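The source does not identify the matching algorithm that was used. Purely as an illustration of the idea, the sketch below compares the opcode sequences of two functions with Python's standard `difflib` and flags pairs above a threshold for manual review. The instructions and the 0.8 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def opcodes(instructions):
    """Keep only the mnemonic (first token) of each instruction."""
    return [ins.split()[0].lower() for ins in instructions if ins.strip()]


def opcode_similarity(func_a, func_b):
    """Similarity of two functions' opcode sequences, in [0, 1]."""
    return SequenceMatcher(None, opcodes(func_a), opcodes(func_b)).ratio()


# Hypothetical functions: same logic, one instruction and some registers differ.
func_a = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
          "add eax, 1", "pop ebp", "ret"]
func_b = ["push ebp", "mov ebp, esp", "mov ecx, [ebp+8]",
          "lea eax, [ecx+1]", "pop ebp", "ret"]

REVIEW_THRESHOLD = 0.8  # assumed value, not taken from the appraisal

score = opcode_similarity(func_a, func_b)
needs_manual_review = score >= REVIEW_THRESHOLD
print(f"opcode similarity: {score:.3f}, manual review: {needs_manual_review}")
```

A score of this kind only selects candidates. As the article notes, a person still has to judge whether the differing statements change what the function mainly does.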
Defenses available to the accused infringer
(1) Excluding public domain code
Public domain code here refers to code outside the scope of the plaintiff's copyright, usually standard library functions, third-party library functions and the like. Such code is available to all developers, so its presence in both programs should not be counted as infringement.
In the American Photography v. Tiktok case, the National Industry and Information Security Center preprocessed the code before comparing the assembly code, excluding C++ standard library functions, Windows library functions, and third-party library functions. The influence of public domain code was therefore already excluded from the appraisal results.
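A preprocessing step of this kind might look like the following sketch. How the appraisal center actually recognized library functions is not described in the source; here they are assumed to be identifiable by name, through a set of exact names and a list of prefixes that the analyst supplies, and the instruction bodies are placeholders.

```python
def exclude_library_functions(functions, library_prefixes, library_names):
    """Return only the functions that are not recognized as library code.

    functions: dict mapping function name -> list of instructions
    library_prefixes: name prefixes treated as library code (assumed list)
    library_names: exact names treated as library code (assumed set)
    """
    prefixes = tuple(library_prefixes)
    kept = {}
    for name, body in functions.items():
        if name in library_names or name.startswith(prefixes):
            continue  # skip standard, platform, and third-party library functions
        kept[name] = body
    return kept


functions = {
    "std::vector<int>::push_back": ["..."],  # C++ standard library
    "CreateFileW": ["..."],                   # Windows API
    "png_read_image": ["..."],                # third-party library (libpng)
    "TEImageBufferGetSizeInBytes": ["..."],   # name taken from the case
}

kept = exclude_library_functions(
    functions,
    library_prefixes=["std::", "png_"],
    library_names={"CreateFileW"},
)
print(sorted(kept))
```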
(2) Excluding limited expression content
Limited expression refers to ideas or information that, by their nature or function, can be expressed only in a limited number of ways. Copyright law typically does not protect such expression, because it is seen as the inevitable outcome of the idea rather than as creative expression. In software development, code that follows common practice or industry convention is likewise limited in expression. Such code is usually standard, generic, and lacking in originality, and therefore should not be counted as infringement.
For example, a class may consist only of member variable definitions and the corresponding accessor methods, such as getters and setters, with conventionally named member variables. Such code follows the standard style recommended for the programming language. Functions of this kind usually compile to very little code, typically fewer than 10 assembly instructions.
In the American Photography v. Tiktok case, the National Industry and Information Security Center applied a rule that only functions with more than 10 lines of code were compared, and the comparison results were then analyzed manually, so that the influence of industry practice and limited expression was excluded.
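The size rule can be expressed as a simple filter. In the sketch below, the threshold of 10 follows the article, while the function names and instruction bodies are hypothetical placeholders.

```python
MIN_LINES = 10  # only functions with MORE than this many lines are compared


def eligible_for_comparison(functions, min_lines=MIN_LINES):
    """Keep functions whose disassembly has more than min_lines instructions."""
    return {name: body for name, body in functions.items() if len(body) > min_lines}


functions = {
    # Hypothetical getter: a handful of instructions, typical of limited expression.
    "getWidth": ["push ebp", "mov ebp, esp", "mov eax, [ecx+4]", "pop ebp", "ret"],
    # Hypothetical larger function; the body is a placeholder of 12 instructions.
    "ProcessFrame": ["nop"] * 12,
}

kept = eligible_for_comparison(functions)
print(list(kept))
```

The filter is deliberately coarse: it removes trivially short functions in bulk, and the manual analysis that follows deals with longer functions that are nonetheless conventional.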
Proof of substantial similarity
Where the defendant is unwilling to provide source code for comparison and the possibility of contact has been established, the court can lower the standard of proof for substantial similarity. Specifically, because the defendant's refusal to submit source code makes it impossible to confirm substantial similarity directly through source code comparison, the court can infer that the defendant's software is substantially similar to the plaintiff's software on the basis of the disassembled code comparison results together with the possibility of contact.
Conclusion
The defendant has the right to object that public domain code and limited expression content should be excluded, and the court should ensure that these objections are fully considered in the appraisal institution's comparison process. If the defendant refuses to provide source code for comparison, the court may lower the standard of proof for substantial similarity and make a reasonable judgment based on the existing evidence.