Abstract
Selecting an appropriate library for reuse within a vast software ecosystem can be a daunting task. A list of features for each library, i.e., a short description of the functionality that can be reused with code examples that illustrate its usage, may alleviate this problem. In this paper, we propose a data-driven approach that uses both the code snippets and the accompanying natural language descriptions from Stack Overflow posts to produce a list of features of a given library. Each extracted feature corresponds to a cluster of API classes and methods considered related based on attributes of the Stack Overflow posts in which they appear. We evaluated the approach considering seven Maven libraries and compared the resulting features against library descriptions from cookbook-like tutorials. The approach achieves an average accuracy of 67% across the seven libraries for the tutorial-like features. For at least 73% of the features extracted by the approach but missing from the documentation, we found a matching library usage in a corpus of GitHub projects. These results suggest that our clusters represent library features, which paves the way to better tool support for documenting software libraries and for selecting a library in an ecosystem.
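To make the idea of "a cluster of API classes and methods considered related based on the posts in which they appear" concrete, here is a minimal sketch. It is not the paper's method: the abstract does not say which post attributes or which clustering algorithm are used, so this sketch assumes plain co-occurrence (Jaccard similarity over post sets) and single-link grouping. The post identifiers, API names, function name `cluster_api_elements`, and the `threshold` parameter are all hypothetical.

```python
from itertools import combinations


def cluster_api_elements(posts, threshold=0.5):
    """Group API elements whose sets of posts overlap strongly.

    Assumption for illustration only: relatedness is the Jaccard
    similarity of the post sets, and groups are formed by single-link
    merging. The paper may use different attributes and algorithms.
    """
    # Invert the mapping: API element -> set of posts in which it appears.
    occurrences = {}
    for post_id, elements in posts.items():
        for element in elements:
            occurrences.setdefault(element, set()).add(post_id)

    # Union-find structure used to merge related elements into groups.
    parent = {element: element for element in occurrences}

    def find(element):
        while parent[element] != element:
            parent[element] = parent[parent[element]]  # path halving
            element = parent[element]
        return element

    # Merge every pair whose Jaccard similarity reaches the threshold.
    for a, b in combinations(sorted(occurrences), 2):
        shared = occurrences[a] & occurrences[b]
        combined = occurrences[a] | occurrences[b]
        if len(shared) / len(combined) >= threshold:
            parent[find(a)] = find(b)

    # Collect the members of each group and return them in sorted order.
    clusters = {}
    for element in occurrences:
        clusters.setdefault(find(element), set()).add(element)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: post id -> API elements found in its code snippets.
posts = {
    "post1": {"Reader.open", "Reader.readLine"},
    "post2": {"Reader.open", "Reader.readLine", "Reader.close"},
    "post3": {"Writer.write", "Writer.flush"},
    "post4": {"Writer.write", "Writer.flush"},
}

for cluster in cluster_api_elements(posts):
    print(cluster)
```

With the toy data and the default threshold, the reader-related and writer-related elements end up in two separate groups, each of which could be read as one candidate "feature". Raising the threshold splits off elements that appear in fewer posts.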
Original language | English |
---|---|
Title of host publication | Proceedings of the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022) |
Publisher | IEEE |
Pages | 207-217 |
Number of pages | 11 |
ISBN (Electronic) | 978-1-6654-3786-8 |
ISBN (Print) | 978-1-6654-3787-5 |
Publication status | Published - 2022 |
Event | 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022), University of Hawaii, Honolulu, United States. Duration: 15 Mar 2022 → 18 Mar 2022. Conference number: 29th. https://saner2022.uom.gr/ |
Publication series
Name | Proceedings - 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2022 |
---|---|
Conference
Conference | 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022) |
---|---|
Abbreviated title | SANER |
Country | United States |
City | Honolulu |
Period | 15/03/22 → 18/03/22 |
Keywords
- machine learning
- software ecosystems
- program comprehension
Projects
- FWOEOS10: Automated Assistance for Developing Software in Ecosystems of the Future
  De Roover, C., Mens, T., Demeyer, S. & Cleve, A.
  1/01/18 → 31/12/21
  Project: Fundamental
  Status: Finished
Prizes
- Distinguished Paper Award to: Uncovering Library Features from API Usage on Stack Overflow
  Velazquez Rodriguez, Camilo Ernesto (Recipient), Constantinou, E. (Recipient) & De Roover, Coen (Recipient), 15 Mar 2022
  Prize: Prize (including medals and awards)