The Docker Hub Image Inheritance Network: Construction and Empirical Insights

Research output: Chapter in Book/Report/Conference proceeding > Conference paper > Research


Abstract

Docker is a popular technology to containerise applications together with their dependencies into reproducible environments. In Docker, container images can depend on others through inheritance. Such inheritance can propagate bad practices and security vulnerabilities from a parent image to its children. Unfortunately, Docker Hub, the most popular online registry of images, lacks transparency about such inheritance. This obscures the software supply chain, possibly leaving image users unaware of quality or security issues caused by parent images. Nonetheless, we found inheritance on Docker Hub to be an understudied topic in academia to date. Therefore, the goal of this paper is to empirically investigate the practice of image inheritance on Docker Hub. To this end, we collect a dataset of 636,625 unique images belonging to popular Docker repositories and identify inheritance by comparing the images' layers. We leverage the constructed inheritance network to empirically investigate three aspects, namely the structure of the inheritance network, how child images differ from their parents, and the outdatedness of parent images. Our results show that most popular community Docker Hub images directly inherit from official images rather than other community ones. We also observe that community child images are often much larger than their parent, in comparison to official child images. This may indicate the existence of gaps between the features provided by official images and those required by consumers, suggesting the need for more ready-made parent images. Finally, we find that around half of the child images use an outdated parent image at the time the child is built, although the time lag is usually less than a month. However, the time lag becomes much larger when we compare against the latest version of the parent image available at the analysis date, with up to 70% of child images using an outdated parent image and a median of over 5 months of time lag. This indicates that users should pay attention to the lineage of the images they consume, and motivates future work on alleviating technical lag in Docker images.
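
The abstract says that inheritance is identified by comparing the images' layers, but it does not spell out the procedure. One plausible reading, given here only as a minimal sketch and not as the authors' actual method, is that an image is a candidate parent when its ordered layer list is a proper prefix of the child's, and that the longest such prefix marks the direct parent. The function name `find_parent`, the image names, and the layer digests below are all hypothetical.

```python
from typing import Dict, List, Optional


def find_parent(child: str, layers: Dict[str, List[str]]) -> Optional[str]:
    """Return the image whose layer list is the longest proper prefix
    of the child's layer list, or None if no such image exists.

    Assumption (not from the paper): layers are ordered from base to top
    and are compared by digest equality.
    """
    child_layers = layers[child]
    best_name: Optional[str] = None
    best_len = 0
    for name, candidate in layers.items():
        if name == child:
            continue
        n = len(candidate)
        # A parent must be non-empty and strictly shorter than the child.
        if 0 < n < len(child_layers) and child_layers[:n] == candidate:
            if n > best_len:
                best_name, best_len = name, n
    return best_name


# Hypothetical toy data: image name -> ordered layer digests.
images = {
    "base": ["sha256:a"],
    "runtime": ["sha256:a", "sha256:b"],
    "app": ["sha256:a", "sha256:b", "sha256:c"],
    "other": ["sha256:x", "sha256:c"],
}

for image in images:
    print(image, "->", find_parent(image, images))
```

In this toy data, `app` shares its first two layers with `runtime`, so `runtime` is chosen over `base` as the direct parent. `other` shares a digest with `app` but not as a prefix, so the sketch reports no parent for it.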
Original language: English
Title of host publication: Proceedings of the 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2023)
Editors: Leon Moonen, Christian Newman, Alessandra Gorla
Publisher: IEEE
Pages: 198-208
Number of pages: 11
ISBN (Electronic): 979-8-3503-0506-7
Publication status: Published - 1 Oct 2023
Event: 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2023) - Bogotá, Colombia
Duration: 2 Oct 2023 to 3 Oct 2023
Conference number: 23
https://www.ieee-scam.org/2023/

Publication series

Name: Proceedings - 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation, SCAM 2023

Conference

Conference: 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2023)
Abbreviated title: SCAM
Country/Territory: Colombia
City: Bogotá
Period: 2/10/23 to 3/10/23

Bibliographical note

Funding Information:
This research was partially funded by the "Cybersecurity Initiative Flanders" project and the Research Foundation Flanders (FWO) under Grant Nos. 1SD4321N and V431423N.

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Docker
  • Docker Hub
  • software supply chain
  • software ecosystems
  • inheritance network
  • technical lag
