Two-phase software clustering method based on complex network theory
Qian Guanqun, Zhang Lin, Zhang Li*
School of Computer Science and Technology, Beijing University of Aeronautics and Astronautics, Beijing 100191, China

Abstract: The GN (Girvan-Newman) algorithm, a well-known community detection algorithm, is introduced into software clustering. To overcome its high computational complexity and to avoid generating small-scale modules, a two-phase software clustering method is proposed. In the first phase, the software is clustered according to its structure patterns. Three structure patterns are identified: star structure, link structure, and topology similarity structure. Clustering these structure patterns efficiently reduces the scale of the software network. In the second phase, a modified GN algorithm is used to cluster the software. If removing the edge with maximal betweenness would produce a module whose scale is smaller than a value set in advance, the removal is forbidden and the edge with the second-largest betweenness is tried instead. The experimental results show that the two-phase clustering algorithm improves the effect of software clustering and can be applied to large-scale software.
Keywords: legacy system; reverse engineering; reengineering
Received 2008-11-20
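
The second phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (edge_betweenness, constrained_gn_step, cluster), the stopping rule (a hypothetical target_modules count), and the tie-breaking order are assumptions made for illustration, and the first-phase pattern merging is omitted because the abstract does not define the patterns. Betweenness is computed with Brandes' algorithm on an unweighted, undirected graph.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness via Brandes' algorithm (unweighted, undirected)."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:                      # BFS counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):         # accumulate dependencies
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # every node pair was counted once from each endpoint
    return {e: b / 2.0 for e, b in bet.items()}

def component_of(adj, start):
    """Set of nodes reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def modules(adj):
    """Connected components of the current graph."""
    seen, result = set(), []
    for v in adj:
        if v not in seen:
            comp = component_of(adj, v)
            seen |= comp
            result.append(comp)
    return result

def constrained_gn_step(adj, min_size):
    """Remove the highest-betweenness edge whose removal is allowed.

    A removal is forbidden if it would split off a module with fewer
    than min_size nodes; the next edge in betweenness order is then tried.
    Returns the removed edge, or None if every removal is forbidden.
    """
    bet = edge_betweenness(adj)
    # Assumption: ties are broken by node labels to keep runs repeatable.
    ranked = sorted(bet, key=lambda e: (-bet[e], sorted(e)))
    for edge in ranked:
        u, v = tuple(edge)
        adj[u].discard(v)
        adj[v].discard(u)
        side_u = component_of(adj, u)
        if v in side_u:                   # graph still connected here
            return (u, v)
        if len(side_u) >= min_size and len(component_of(adj, v)) >= min_size:
            return (u, v)
        adj[u].add(v)                     # forbidden: restore the edge
        adj[v].add(u)
    return None

def cluster(edges, min_size, target_modules):
    """Repeat constrained steps until target_modules modules exist
    (hypothetical stopping rule) or no removal is allowed."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    while len(modules(adj)) < target_modules:
        if constrained_gn_step(adj, min_size) is None:
            break
    return modules(adj)

# Toy graph: a two-node tail (a-b) attached at c to a 4-clique {c, d, e, f}.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("c", "f"),
         ("d", "e"), ("d", "f"), ("e", "f")]
print(cluster(edges, min_size=3, target_modules=2))
```

On this toy graph, plain GN behaviour (min_size=1) cuts the edge b-c first and leaves the two-node module {a, b}. With min_size=3 that removal is forbidden, so the sketch works through lower-ranked edges and ends with the modules {a, b, c} and {d, e, f}.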

Qian Guanqun, Zhang Lin, Zhang Li. Two-phase software clustering method based on complex network theory[J]. JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND A, 2009, V35(12): 1438-1442
