Mining Patterns in Source Code using Tree Mining Algorithms

Hoang Son Pham, Siegfried Nijssen, Kim Mens, Dario Di Nucci, Tim Christiaan Molderez, Coen De Roover, Johan Fabry, Vadim Zaytsev

Research output: Chapter in Book/Report/Conference proceeding > Conference paper



Discovering regularities in source code is of great interest to software engineers, both in academia and in industry, as regularities can provide useful information to help in a variety of tasks such as code comprehension, code refactoring, and fault localisation. However, traditional pattern mining algorithms often find too many patterns of little use and hence are not suitable for discovering useful regularities. In this paper we propose FREQTALS, a new algorithm for mining patterns in source code based on the FREQT tree mining algorithm. First, we introduce several constraints that effectively enable us to find more useful patterns; then, we show how to efficiently include them in FREQT. To illustrate the usefulness of the constraints we carried out a case study in collaboration with software engineers, where we identified a number of interesting patterns in a repository of Java code.
Original language: English
Title of host publication: Proceedings of the 22nd International Conference on Discovery Science (DS2019)
Publisher: DS2019: 22nd International Conference on Discovery Science
Number of pages: 10
ISBN (Print): 978-3-030-33778-0
Publication status: Published - 28 Oct 2019
Event: 22nd International Conference on Discovery Science - Split, Croatia
Duration: 28 Oct 2019 to 30 Oct 2019
Conference number: 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11828 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 22nd International Conference on Discovery Science
Abbreviated title: DS


  • Pattern Mining
  • Frequent Tree Mining
  • Source Code Regularities

