Visual data mining and analysis of software repositories

S.L. Voinea, A.C. Telea

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

In this article we describe an ongoing effort to integrate information visualization techniques into the configuration management process for software systems. Our focus is to help software engineers manage the evolution of large and complex software systems by offering them effective and efficient ways to query and assess system properties using visual techniques. To this end, we combine several techniques from different domains. First, we construct an infrastructure that allows generic querying and data mining of different types of software repositories, such as CVS and Subversion. Using this infrastructure, we construct several models of source code evolution at different levels of detail, ranging from project and package down to function and code line. Second, we describe a set of views for examining the code evolution models at different levels of detail and from different perspectives. We detail three views: the file view shows changes at line level across many versions of one or a few files; the project view shows changes at file level across entire software projects; and the decomposition view shows changes at subsystem level across entire projects. We illustrate how the proposed techniques, which we implemented in a fully operational toolset, have been used to answer non-trivial questions on several real-world, industry-size software projects. Our work is at the crossroads of applied software engineering (SE) and information visualization (InfoVis), as our toolset aims to tightly integrate the methods promoted by the InfoVis field into SE practice.
Original language: English
Pages (from-to): 410-428
Journal: Computers and Graphics
Volume: 31
Issue number: 3
Publication status: Published - 2007
