Shapes Constraint Language: formalization, expressiveness, and provenance

Maxime Jakubowski

Research output: Thesis › PhD Thesis

Abstract

The Shapes Constraint Language (SHACL) is a W3C-proposed schema language for expressing structural constraints on RDF graphs. Constraints on nodes are called “shapes”. When shapes are coupled with so-called “target declarations”, which specify which nodes must adhere to which shapes, we obtain a complete constraint on RDF graphs. We study several aspects of this language. First, recent formalizations show a striking resemblance to description logics. We build on these formalizations to arrive at an understanding of SHACL as a logic. Furthermore, because the SHACL specification only defines semantics for non-recursive SHACL, some efforts have been made to allow recursive SHACL schemas. We argue that, for defining and studying the semantics of recursive SHACL, lessons can be learned from research in non-monotonic reasoning. We look at the semantics proposed in the literature and compare them with well-established techniques from non-monotonic reasoning.

Next, SHACL expressions can use three fundamental features that are not so common in similar logics: equality tests, disjointness tests, and closure constraints. It is not clear how the presence of these non-standard features impacts the expressiveness of SHACL. We show that each of the three features is primitive: using the feature, one can express boolean queries that are not expressible without it. We also show that the restriction that SHACL imposes on allowed targets is inessential, as long as closure constraints are not used. In addition, we show that enriching SHACL with “full” versions of equality tests or disjointness tests results in a strictly more powerful language.
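As a rough illustration of what the three features check, the following Python sketch evaluates simplified versions of them on a toy graph stored as a set of triples. It is not the formalization used in the thesis: the function names, the restriction to comparing two plain properties, and the example data are all assumptions made for illustration only.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def values(graph: Set[Triple], node: str, prop: str) -> Set[str]:
    """All objects reachable from `node` via `prop`."""
    return {o for (s, p, o) in graph if s == node and p == prop}


def equality_test(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """True if `node` has exactly the same p-values and q-values."""
    return values(graph, node, p) == values(graph, node, q)


def disjointness_test(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """True if `node` has no value shared between p and q."""
    return values(graph, node, p).isdisjoint(values(graph, node, q))


def closure_constraint(graph: Set[Triple], node: str, allowed: Set[str]) -> bool:
    """True if every outgoing property of `node` is in `allowed`."""
    return all(p in allowed for (s, p, o) in graph if s == node)


# Hypothetical example data.
GRAPH: Set[Triple] = {
    ("alice", "friend", "bob"),
    ("alice", "colleague", "bob"),
    ("bob", "friend", "carol"),
    ("bob", "colleague", "dave"),
    ("bob", "email", "bob@example.org"),
}

if __name__ == "__main__":
    print(equality_test(GRAPH, "alice", "friend", "colleague"))
    print(disjointness_test(GRAPH, "bob", "friend", "colleague"))
    print(closure_constraint(GRAPH, "bob", {"friend", "colleague"}))
```

In this toy data, alice's friends and colleagues coincide, bob's are disjoint, and bob violates the closure constraint because of his extra `email` property.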

Lastly, we propose provenance semantics for SHACL. We introduce the notion of the neighborhood of a node v satisfying a given shape in a graph G. This neighborhood is a subgraph of G, and provides data provenance of v for the given shape. We establish a correctness property for the resulting provenance mechanism by proving that neighborhoods adhere to the Sufficiency requirement articulated for provenance semantics for database queries. As an additional benefit, neighborhoods allow a novel use of shapes: the extraction of a subgraph from an RDF graph, the so-called shape fragment. We compare shape fragments with SPARQL queries. We discuss implementation strategies for computing neighborhoods, and present initial experiments demonstrating that our ideas are feasible.
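To make the idea of a neighborhood more concrete, here is a minimal Python sketch for one hard-coded shape: "the node has at least one `author` edge to a node of type `Person`". The thesis defines neighborhoods for arbitrary shapes; this sketch does not reproduce that definition. The shape, the property names, the function names, and the data are hypothetical, and the code only illustrates the intuition that the neighborhood is a subgraph in which the node still satisfies the shape.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def is_person(graph: Set[Triple], node: str) -> bool:
    return (node, "type", "Person") in graph


def conforms(graph: Set[Triple], node: str) -> bool:
    """Hard-coded shape: at least one `author` edge to a Person."""
    return any(
        s == node and p == "author" and is_person(graph, o)
        for (s, p, o) in graph
    )


def neighborhood(graph: Set[Triple], node: str) -> Set[Triple]:
    """Triples witnessing that `node` conforms; empty if it does not."""
    result: Set[Triple] = set()
    for (s, p, o) in graph:
        if s == node and p == "author" and is_person(graph, o):
            result.add((s, p, o))
            result.add((o, "type", "Person"))
    return result


def shape_fragment(graph: Set[Triple], targets: Iterable[str]) -> Set[Triple]:
    """Union of the neighborhoods of all target nodes."""
    fragment: Set[Triple] = set()
    for t in targets:
        fragment |= neighborhood(graph, t)
    return fragment


# Hypothetical example data.
GRAPH: Set[Triple] = {
    ("paper1", "author", "ann"),
    ("ann", "type", "Person"),
    ("paper1", "year", "2024"),
    ("paper2", "author", "acme"),
    ("acme", "type", "Organization"),
}

if __name__ == "__main__":
    nb = neighborhood(GRAPH, "paper1")
    print(sorted(nb))
    # Sufficiency, informally: the node still conforms inside its neighborhood.
    print(conforms(nb, "paper1"))
```

Here the `year` triple is left out of the neighborhood of `paper1` because it plays no role in satisfying the shape, and `paper2` contributes nothing to the fragment because it does not conform.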
Original language: English
Awarding Institution
  • Vrije Universiteit Brussel
  • UHasselt
  • Maastricht University
Supervisors/Advisors
  • Van den Bussche, Jan, Supervisor, External person
  • Bogaerts, Bart, Supervisor
Award date: 31 May 2024
Publication status: Published - 2024
