Modern Datalog Engines

Bas Ketsman, Paraschos Koutris

Research output: Contribution to journal › Article › peer-review

Abstract

Recent years have seen a resurgence of interest in Datalog from both industry and the research community. Datalog is a declarative query language that extends relational algebra with recursion. It has been used to express a wide spectrum of modern data management tasks, such as data integration, declarative networking, graph analysis, business analytics, and program analysis. The result of this long line of research is a plethora of Datalog engines, which support different variants of Datalog and have different technical specifications and capabilities. In this monograph, we provide an overview of the architecture and technical characteristics of these Datalog engines. We identify common architectural decisions and evaluation methods, as well as the data structures and layouts used to speed up query execution. We also discuss how Datalog engines differ when they specialize in workloads with different characteristics (for example, data analytics vs. program analysis vs. graph analysis). One particular focus is how modern Datalog engines scale to massively parallel environments.
Original language: English
Pages (from-to): 1-68
Number of pages: 68
Journal: Foundations and Trends® in Databases
Volume: 12
Issue number: 1
Publication status: Published - 2022

Keywords

  • Data Models and Query Languages
  • Database Theory
  • Parallel and Distributed Database Systems
  • Query Processing and Optimization
