A Language-Parametric Modular Framework for Mining Idiomatic Code Patterns

Hoang Son Pham, Siegfried Nijssen, Kim Mens, Dario Di Nucci, Tim Christiaan Molderez, Coen De Roover, Johan Fabry, Vadim Zaytsev

Research output: Chapter in Book/Report/Conference proceeding; Meeting abstract (Book); Research

Abstract

In an ongoing industry-university collaboration we are developing a language-parametric framework for mining code idioms in legacy systems. This modular framework has a pipeline architecture and a language-parametric meta representation of the artefacts used by each of its five components: source code importer, mining preprocessor, pattern miner, pattern matcher, and modernisation assistant. The pipeline enables reuse of its components across systems and languages, and lets project partners work on each component separately. For example, novel pattern mining techniques can be explored independently of the languages to which they will be applied and of the modernisation assistant in which they will be used. Our first results on mining Java and COBOL code are promising, even though challenges still lie ahead to make the framework and its constituent components truly scalable, customisable, and language independent.
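
The pipeline idea described above can be illustrated with a minimal sketch. It is not the framework's actual design: the `Node` class, the stage functions, and the use of Python's own `ast` module as the importer are all assumptions made for illustration, and the miner counts frequent parent/child label pairs as a naive stand-in for frequent tree mining. The modernisation assistant is omitted because the abstract does not describe its behaviour.

```python
from __future__ import annotations

import ast
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical language-neutral tree node shared by all stages."""
    label: str
    children: list[Node] = field(default_factory=list)


def import_python(source: str) -> Node:
    """Importer stand-in: convert Python source into the neutral tree."""
    def convert(n: ast.AST) -> Node:
        return Node(type(n).__name__,
                    [convert(c) for c in ast.iter_child_nodes(n)])
    return convert(ast.parse(source))


def preprocess(tree: Node) -> list[tuple[str, str]]:
    """Preprocessor stand-in: flatten a tree into parent/child label pairs."""
    pairs, stack = [], [tree]
    while stack:
        node = stack.pop()
        for child in node.children:
            pairs.append((node.label, child.label))
            stack.append(child)
    return pairs


def mine(datasets: list[list[tuple[str, str]]],
         min_support: int) -> set[tuple[str, str]]:
    """Miner stand-in: keep pairs that occur in at least min_support trees."""
    support = Counter(pair for pairs in datasets for pair in set(pairs))
    return {pair for pair, count in support.items() if count >= min_support}


def match(pattern: tuple[str, str], pairs: list[tuple[str, str]]) -> bool:
    """Matcher stand-in: report whether a mined pattern occurs in one tree."""
    return pattern in pairs


# The later stages only see Node trees and label pairs, so a different
# importer (for another language) could be swapped in without changing them.
sources = ["x = len(a)", "y = len(b)", "print(c)"]
datasets = [preprocess(import_python(s)) for s in sources]
patterns = mine(datasets, min_support=2)
hits = [s for s, d in zip(sources, datasets) if match(("Assign", "Call"), d)]
```

In this sketch only `import_python` knows about a concrete language; everything downstream operates on the shared representation, which is the property the abstract attributes to the pipeline.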
Original language: English
Title of host publication: Seminar on Advanced Techniques & Tools for Software Evolution
Number of pages: 6
Volume: 1
Publication status: Published - 2019
Event: 12th Seminar on Advanced Techniques & Tools for Software Evolution - Free University of Bozen-Bolzano, Bolzano, Italy
Duration: 8 Jul 2019 → …
Conference number: 12
http://sattose.org/2019

Publication series

Name: CEUR Workshop Proceedings
ISSN (Print): 1613-0073

Conference

Conference: 12th Seminar on Advanced Techniques & Tools for Software Evolution
Abbreviated title: SATToSE
Country: Italy
City: Bolzano
Period: 8/07/19 → …

Keywords

  • Pattern Mining
  • Frequent Tree Mining
  • Source Code Regularities
