Your browser does not support JavaScript!

Comparative study of clustering algorithms and abstract representations for software remodularisation

Comparative study of clustering algorithms and abstract representations for software remodularisation

For access to this article, please select a purchase option:

Buy article PDF
(plus tax if applicable)
Buy Knowledge Pack
10 articles for $120.00
(plus taxes if applicable)

IET members benefit from discounts to all IET publications and free access to E&T Magazine. If you are an IET member, log in to your account and the discounts will automatically be applied.

Learn more about IET membership 

Recommend Title Publication to library

You must fill out fields marked with: *

Librarian details
Your details
Why are you recommending this title?
Select reason:
IEE Proceedings - Software — Recommend this title to your library

Thank you

Your recommendation has been sent to your librarian.

As valuable software systems become older, reverse engineering becomes increasingly important to companies that have to maintain the code. Clustering is a key activity in reverse engineering that is used to discover improved designs of systems or to extract significant concepts from code. Clustering is an old, highly sophisticated, activity which offers many methods to meet different needs. The various methods have been well documented in the past; however, conclusions from general clustering literature may not apply entirely to the reverse engineering domain. In the paper, the authors study three decisions that need to be made when clustering: the choice of (i) abstract descriptions of the entities to be clustered, (ii) metrics to compute coupling between the entities, and (iii) clustering algorithms. For each decision, our objective is to understand which choices are best when performing software remodularisation. The experiments were conducted on three public domain systems (gcc, Linux and Mosaic) and a real world legacy system (2 million LOC). Among other things, the authors confirm the importance of a proper description scheme for the entities being clustered, list a few effective coupling metrics and characterise the quality of different clustering algorithms. They also propose description schemes not directly based on the source code, and advocate better formal evaluation methods for the clustering results.

Related content

This is a required field
Please enter a valid email address