Automatic library categorization

Research output: Unpublished contribution to conference › Unpublished abstract

1 Citation (Scopus)


Software ecosystems contain several types of artefacts, such as libraries, documentation and source code files. Recent studies show that the Maven software ecosystem alone already contains over 2.8 million artefacts and over 70,000 libraries. Given the size of the ecosystem, selecting a library represents a challenge to its users.

The MVNRepository website offers a category-based search functionality as a solution. However, not all of the libraries have been categorised, which leads to incomplete search results. This work proposes an approach to the automatic categorisation of libraries through machine learning classifiers trained on class and method names. Our preliminary results show that the approach is accurate, suggesting that large-scale applications may be feasible.
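The idea of classifying a library from its class and method names can be illustrated with a minimal sketch. The abstract does not say which classifier or preprocessing the authors used, so everything below is an assumption made for illustration: the camel-case tokeniser `split_identifier`, the multinomial Naive Bayes model `NaiveBayesLibraryClassifier`, the toy training data and the category labels are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def split_identifier(name):
    """Split a class or method name such as 'parseJsonObject' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


class NaiveBayesLibraryClassifier:
    """Multinomial Naive Bayes over identifier tokens (an illustrative stand-in,
    not necessarily the classifier used in the work described above)."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # category -> token frequencies
        self.doc_counts = Counter()               # category -> number of libraries
        self.vocabulary = set()

    def fit(self, libraries):
        # libraries: iterable of (list of class/method names, category label)
        for identifiers, category in libraries:
            tokens = [t for name in identifiers for t in split_identifier(name)]
            self.token_counts[category].update(tokens)
            self.doc_counts[category] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, identifiers):
        tokens = [t for name in identifiers for t in split_identifier(name)]
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, counts in self.token_counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(self.doc_counts[category] / total_docs)
            total_tokens = sum(counts.values())
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical toy data: identifiers from already categorised libraries.
training = [
    (["JsonParser", "parseObject", "JsonWriter", "writeArray"], "JSON Libraries"),
    (["Logger", "logMessage", "LogLevel", "appendLog"], "Logging Frameworks"),
    (["HttpClient", "sendRequest", "HttpResponse", "getStatusCode"], "HTTP Clients"),
]
model = NaiveBayesLibraryClassifier().fit(training)

# Categorise a library that has no category yet.
print(model.predict(["JsonReader", "readObject"]))
```

On this toy data the uncategorised library is assigned to "JSON Libraries", because the tokens "json" and "object" occur only in that category's training identifiers. A realistic setup would need many labelled libraries per category and a held-out evaluation set.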
Original language: English
Number of pages: 2
Publication status: Published - 2020
Event: 3rd International Workshop on Software Health, Korea, Republic of
Duration: 5 Oct 2020 to 11 Oct 2020


Workshop: 3rd International Workshop on Software Health
Country/Territory: Korea, Republic of


  • Software Ecosystems
  • API Category
  • Text Classification

