
AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis

conference contribution
posted on 2024-02-27, 09:49 authored by Jordan Samhi, Tegawendé Bissyandé, Jacques Klein
Android app developers extensively employ code reuse, integrating many third-party libraries into their apps. While such integration is practical for developers, it can be challenging for static analyzers to achieve scalability and precision when libraries account for a large part of the code. As a direct consequence, it is common practice in the literature to consider developer code only during static analysis, with the assumption that the sought issues are in developer code rather than in the libraries. However, analysts need to distinguish between library and developer code. Currently, many static analyses rely on white lists of libraries. However, these white lists are unreliable, inaccurate, and largely non-comprehensive. In this paper, we propose a new approach to address the lack of comprehensive and automated solutions for the production of accurate and "always up to date" sets of libraries. First, we demonstrate the continued need for a white list of libraries. Second, we propose an automated approach to produce an accurate and up-to-date set of third-party libraries in the form of a dataset called AndroLibZoo. Our dataset, which we make available to the community, contains to date 34 813 libraries and is meant to evolve.
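
To illustrate how a static analysis might use a white list of libraries to separate library code from developer code, here is a minimal sketch. It is not the authors' tooling: the function names, the package prefixes, and the class names are all hypothetical, and it assumes the white list is a plain set of Java package prefixes.

```python
from typing import Iterable, List


def is_library_class(class_name: str, library_prefixes: Iterable[str]) -> bool:
    """Return True if the class belongs to a white-listed library package."""
    for prefix in library_prefixes:
        # Match the package itself or any sub-package, but not look-alike
        # names such as "okhttp3x.Foo" for the prefix "okhttp3".
        if class_name == prefix or class_name.startswith(prefix + "."):
            return True
    return False


def developer_classes(class_names: Iterable[str],
                      library_prefixes: Iterable[str]) -> List[str]:
    """Keep only the classes that no white-list prefix matches."""
    prefixes = tuple(library_prefixes)  # materialize once for repeated scans
    return [c for c in class_names if not is_library_class(c, prefixes)]


# Hypothetical white list and app classes, for illustration only.
EXAMPLE_PREFIXES = {"okhttp3", "com.google.gson"}
EXAMPLE_CLASSES = [
    "com.example.app.MainActivity",
    "okhttp3.OkHttpClient",
    "com.google.gson.Gson",
    "okhttp3x.Lookalike",
]

if __name__ == "__main__":
    for name in developer_classes(EXAMPLE_CLASSES, EXAMPLE_PREFIXES):
        print(name)
```

A prefix-based filter like this is only as good as the list behind it: a missing or outdated prefix silently leaves library code in the analysis scope, which is the weakness of hand-maintained white lists that the abstract describes.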


Primary Research Area

  • Secure Connected and Mobile Systems

Name of Conference

The International Conference on Mining Software Repositories (MSR)


@conference{Samhi:Bissyandé:Klein, title = "AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis", author = "Samhi, Jordan and Bissyandé, Tegawendé and Klein, Jacques" }
