Abstract
Recent studies show that the Maven ecosystem alone already contains over 2 million library artefacts, including their source code, byte code, and documentation. To help developers cope with this volume of information, several websites overlay configurable views on the ecosystem: for instance, views in which similar libraries are grouped into categories, or views showing all libraries that carry tags corresponding to coarse-grained library features. The MVNRepository overlay website offers both category-based and tag-based views. Unfortunately, several libraries have not been categorised or are missing relevant tags. Some initial approaches to the automated categorisation of Maven libraries have already been proposed. However, no such approach exists for the problem of tagging libraries in a multi-label setting.
This paper proposes MUTAMA, a multi-label classification approach to the Maven library tagging problem based on information extracted from the byte code of each library. We analysed 4088 randomly selected libraries from the Maven software ecosystem. MUTAMA trains and deploys five multi-label classifiers using feature vectors obtained from the class and method names of the tagged libraries. Our results indicate that classifiers based on ensemble methods achieve the best performance. Finally, we propose directions to follow in this area.
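To make the idea of tagging from class and method names more concrete, the following is a minimal Python sketch, not MUTAMA's actual pipeline. It assumes identifiers are split into lowercase word tokens, counted into a bag-of-words vector, and fed to one independent binary classifier per tag (binary relevance). The per-tag nearest-centroid classifier is a stand-in chosen for brevity, since the abstract does not name the five classifiers used. All library names, identifiers, and tags below are hypothetical.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase, PascalCase, or snake_case identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def library_tokens(identifiers):
    """Count the word tokens over all class and method names of one library."""
    counts = Counter()
    for ident in identifiers:
        counts.update(split_identifier(ident))
    return counts


def build_vocabulary(all_counts):
    """Map every token seen in the training libraries to a vector index."""
    vocab = sorted({tok for counts in all_counts for tok in counts})
    return {tok: i for i, tok in enumerate(vocab)}


def to_vector(counts, vocab):
    """Turn token counts into a fixed-length vector; unknown tokens are ignored."""
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


def centroid(vectors, dim):
    """Component-wise mean of the vectors, or a zero vector if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def train_binary_relevance(vectors, tag_sets):
    """Binary relevance: one (positive, negative) centroid pair per tag."""
    dim = len(vectors[0])
    model = {}
    for tag in sorted(set().union(*tag_sets)):
        pos = [v for v, tags in zip(vectors, tag_sets) if tag in tags]
        neg = [v for v, tags in zip(vectors, tag_sets) if tag not in tags]
        model[tag] = (centroid(pos, dim), centroid(neg, dim))
    return model


def predict_tags(model, vec):
    """Assign every tag whose positive centroid is closer than its negative one."""
    return {tag for tag, (pos_c, neg_c) in model.items()
            if cosine(vec, pos_c) > cosine(vec, neg_c)}


# Hypothetical training data: identifiers and tags of three toy libraries.
training = [
    (["JsonParser", "parseObject", "JsonWriter", "writeArray"], {"json", "parser"}),
    (["HttpClient", "sendRequest", "HttpResponse", "readBody"], {"http", "client"}),
    (["RestClient", "sendJsonRequest", "parseJsonResponse"], {"http", "json", "client"}),
]

all_counts = [library_tokens(idents) for idents, _ in training]
vocab = build_vocabulary(all_counts)
vectors = [to_vector(c, vocab) for c in all_counts]
model = train_binary_relevance(vectors, [tags for _, tags in training])

# An untagged (hypothetical) library described only by its identifiers.
query = to_vector(library_tokens(["JsonReader", "parseArray"]), vocab)
predicted = predict_tags(model, query)
print(sorted(predicted))  # ['json', 'parser']
```

Because each tag gets its own yes/no decision, a library can receive any number of tags, which is what distinguishes this multi-label setting from assigning a single category.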
Original language | English |
---|---|
Title of host publication | Proceedings of the 20th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM2020) |
Publisher | IEEE |
Pages | 243-247 |
Number of pages | 5 |
ISBN (Electronic) | 9781728192482 |
ISBN (Print) | 978-1-7281-9248-2 |
Publication status | Published - Sep 2020 |
Event | 20th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM2020) - Adelaide, Australia. Duration: 27 Sep 2020 → 28 Sep 2020 |
Publication series
Name | Proceedings - 20th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2020 |
---|
Conference
Conference | 20th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM2020) |
---|---|
Country | Australia |
City | Adelaide |
Period | 27/09/20 → 28/09/20 |
Keywords
- multi-label classification
- libraries
- software ecosystems
- machine learning
- software engineering
Fingerprint
Research topics of 'MUTAMA: An Automated Multi-label Tagging Approach for Software Libraries on Maven'.
Projects
- FWOEOS10: Automated Assistance for Developing Software in Ecosystems of the Future
  De Roover, C., Mens, T., Demeyer, S. & Cleve, A.
  1/01/18 → 31/12/21
  Project: Fundamental
  Status: Finished
Activities
- 20th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM2020)
  Camilo Ernesto Velazquez Rodriguez (Participant)
  27 Sep 2020 → 28 Sep 2020
  Activity: Participating in or organising an event › Participation in conference