Exploring Static Inter-Procedural API Misuse Detection Using Graph Inlining

Scriptie/masterproef: Master's Thesis


Programming against APIs is difficult. Application Programming Interfaces are often large, complicated, and specify implicit contracts which are not readily documented. Therefore, programmers often make mistakes when using API methods, which may lead to bugs and severe complications involving crashes, data loss, or security vulner- abilities.
To alleviate such difficulties, a lot of research has gone into automatically inferring API rules and usage patterns, and statically detect violations of such rules and patterns in a target project. Such tools are often called API misuse detectors. The state-of-the-art misuse detectors use pattern mining to infer correct usages of APIs, and identify misuses of the API as usages that deviate from the mined patterns. Recent work has explored graph-based representations to model such usages, but neglects inter- procedural analysis. This causes a significant number of false positives in their findings, due to missing graph elements that are present in transitively called methods. Such false positives decrease the precision of the tools, and is one of the reasons they are still difficult to use in practice.
In this dissertation, we present two main contributions to the field of static API misuse detection. As our first contribution, we incorporate inter-procedural analysis into graph-based pattern mining and violation detection through a process called graph inlining. This embeds the graph representation of a called method into the graph representation of its caller. To properly integrate the calleeā€™s graph, we must carefully consider code semantics, and thus, our approach considers many different types of control flow and data flow in its inlining algorithm. To customise this inlining process, we develop six inlining strategies that differ in the way they handle recursive calls, duplicate calls, and calls to methods that do not contain any direct API usage. These strategies lead to different patterns, and thus to different detected misuses.
Our second contribution is a general filtering algorithm to eliminate inter-procedural false positives. It is capable of removing incorrect violations due to missing pattern elements that are present in the call context surrounding the violating method, as well as removing duplicate violations that may occur due to inter-procedural analysis. This filtering algorithm inspects the callers and callees of a method to determine whether the violation reported in this method is likely a true misuse, and is based on the key insight that if a callee contains a violation, then all of its callers contain the same violation after inlining, and thus, the callers contain duplicated violations.
We evaluate our approach on a dataset of nine real Java projects containing actual misuses of APIs. We find that our approach effectively mines inter-procedural patterns, allowing it to identify new misuses, and also correctly eliminates inter-procedural false positives from its findings. We identify a number of important limitations of our work, including the prevalence of false positive misuses due to uncommon but correct alternative usages, as well as the increased cost of pattern mining and violation detection in the larger graphs produced by inlining. We investigate these limitations further, and pave the way to overcoming these issues in the future.
Datum Prijs9 sep 2019
BegeleiderCoen De Roover (Promotor) & Camilo Ernesto Velazquez Rodriguez (Advisor)

Citeer dit