A method and system for plagiarism detection in Java source code and bytecode
Abstract: This paper proposes a plagiarism detection approach that applies to both Java source code and bytecode, together with a supporting tool. The method takes the Java source file or class file of a class as the unit of comparison, extracts five kinds of feature vectors that represent the program's syntactic structure and semantic features, and combines them into a similarity score for a pair of class files. The score helps judge whether one class file is wholly or partly plagiarized from the other. Statistical analysis and several graphical visualizations aid in interpreting the analysis results. The approach was evaluated in an experimental comparison with a typical commercial source code plagiarism detection tool and in a case study that applied the tool to a set of manually modified programs. The results show that the tool is more efficient in that comparison, achieves high detection performance, and recognizes both exact and approximate copies, including most types of source code transformation used in program plagiarism.
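As a rough illustration of the combination step described in the abstract, the Java sketch below scores each pair of feature vectors with cosine similarity and merges the per-feature scores with a weighted average. The abstract does not state which per-feature measure or which weights the method uses, so cosine similarity, the weights, and all names here (`SimilaritySketch`, `cosine`, `combined`) are assumptions made for illustration, not the authors' implementation.

```java
final class SimilaritySketch {

    /**
     * Cosine similarity of two equal-length numeric feature vectors.
     * Returns 0.0 when either vector is all zeros (an assumed convention).
     */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /**
     * Weighted average of per-feature cosine similarities.
     * featuresA[k] and featuresB[k] hold the k-th feature vector of each class file.
     * The weights are hypothetical tuning parameters with a positive sum.
     */
    static double combined(double[][] featuresA, double[][] featuresB, double[] weights) {
        if (featuresA.length != featuresB.length || featuresA.length != weights.length) {
            throw new IllegalArgumentException("feature and weight counts differ");
        }
        double total = 0.0, weightSum = 0.0;
        for (int k = 0; k < weights.length; k++) {
            total += weights[k] * cosine(featuresA[k], featuresB[k]);
            weightSum += weights[k];
        }
        if (weightSum <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        return total / weightSum;
    }

    public static void main(String[] args) {
        // Two made-up feature vectors per class file; the method in the abstract uses five.
        double[][] classA = {{3, 1, 0, 2}, {5, 5, 1}};
        double[][] classB = {{3, 1, 1, 2}, {4, 5, 1}};
        double[] weights = {0.6, 0.4};
        System.out.println("similarity = " + combined(classA, classB, weights));
    }
}
```

A score near 1 would flag a pair of class files for manual review; any decision threshold would have to be calibrated on known copied and independent programs.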
Key words:
- plagiarism detection
- Java source code
- Java byte code
- similarity measurement