This project aims to cover Brussels, metaphorically speaking of course, with a cloud of structured and interlinked information elements produced by "atomizing" a collection of relevant databases and other resources. Linked Data is a global initiative to interlink resources on the Web using two "simple" technologies: Uniform Resource Identifiers (URIs) for accessing the resources and the Resource Description Framework (RDF) for representing knowledge about those resources and annotating them. Various governments (UK, US, Germany) are launching initiatives to (a) make public sector data easily available as Linked Data and (b) encourage researchers to analyze the data and application developers to build applications around it, in order to stimulate innovation, business and the general wealth of society. We believe the time and circumstances are right for Brussels to prepare for this evolution.
Clearly, the meaningful exploitation of such interlinked cloud-type resources requires the adoption of semantic technology, in particular so-called ontologies, which contain computer representations of agreements on concepts, facts and rules. To this end we propose to provide a platform on which communities collaboratively create such ontologies through natural language (NL). Those ontologies can then be used to publish annotated information from databases on the Web, and/or to assist the automatic annotation of resources such as videos and images. The multilingual nature of Brussels presents an additional challenge to be addressed. Each member of the consortium leads one of the four interconnected pillars of this project (see below). The members are not only top international-level experts in their domains, but have also all collaborated before (usually pairwise) in EU or local projects. The four pillars constituting the whole platform are:
Ontology Creation, led by VUB STARLab. STARLab contributes expertise in methods and tools for ontology construction, based upon proven DB modeling techniques that partly use NL. This expertise is applied to create a framework for knowledge management in which communities (stakeholders, adopters and users) play a vital role. This pillar mainly focuses on methodology and technology for guiding the community and its discourse towards an agreed representation.
Automatic Annotation, led by VUB ETRO. ETRO contributes extensive expertise in image and video analysis and in multimedia representation and annotation. This expertise is applied to object and event recognition in multimodal databases using the ontologies created in the first pillar. Such annotation allows language-independent queries on those databases once the ontology is aligned with the multilingual terminology base created in the third pillar. Image querying will be developed according to the upcoming JPSearch standard, resulting in interoperability between the query processor and various (semantically annotated) multimedia repositories.
Multilingual Terminology, led by Erasmushogeschool CVC. CVC will construct a multilingual terminology base within a given domain (including the variance between two terms in different languages). Such a terminology base can assist the automatic interlinking of different annotated resources based on their content. It also makes it possible to navigate from a document in one language to a related document in another language.
Atomization of Databases, led by ULB WIT. WIT will investigate how heterogeneous sources or implicit information in large databases may be scalably converted and exposed as RDF triples on the Web. This will in particular be applied to spatio-temporal knowledge, as this type of resource is becoming increasingly important for linking between various domains.
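As a rough illustration of what "atomizing" a database record into RDF triples could look like, the following minimal Python sketch turns one row into N-Triples statements. It is not the project's actual method: the `http://example.org/brussels/` namespace, the property names (`latitude`, `longitude`, `openedOn`), the helper functions and the sample row are all hypothetical and invented for this example. Only the `rdf:type`, `rdfs:label` and XML Schema datatype URIs are standard.

```python
from urllib.parse import quote

# Hypothetical namespace for illustration only; not defined by the project.
BASE = "http://example.org/brussels/"
# Standard W3C vocabulary URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD = "http://www.w3.org/2001/XMLSchema#"


def uri(term):
    """Wrap an absolute URI in angle brackets, as N-Triples requires."""
    return f"<{term}>"


def literal(value, datatype=None, lang=None):
    """Serialize a value as an N-Triples literal with optional language tag or datatype."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    if lang:
        return f'"{text}"@{lang}'
    if datatype:
        return f'"{text}"^^<{datatype}>'
    return f'"{text}"'


def atomize(row):
    """Convert one database row (a dict) into a list of (subject, predicate, object) triples."""
    subject = uri(BASE + "school/" + quote(str(row["id"])))
    onto = BASE + "ontology/"  # hypothetical ontology namespace
    return [
        (subject, uri(RDF_TYPE), uri(onto + "School")),
        # One label per language: a simple way to carry multilingual names.
        (subject, uri(RDFS_LABEL), literal(row["name_nl"], lang="nl")),
        (subject, uri(RDFS_LABEL), literal(row["name_fr"], lang="fr")),
        # Spatial and temporal attributes become typed literals.
        (subject, uri(onto + "latitude"), literal(row["lat"], datatype=XSD + "decimal")),
        (subject, uri(onto + "longitude"), literal(row["lon"], datatype=XSD + "decimal")),
        (subject, uri(onto + "openedOn"), literal(row["opened"], datatype=XSD + "date")),
    ]


def to_ntriples(triples):
    """Render triples as N-Triples text, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up sample record; the values are placeholders, not real data.
SAMPLE_ROW = {
    "id": 42,
    "name_nl": "Voorbeeldschool",
    "name_fr": "École Exemple",
    "lat": 50.8467,
    "lon": 4.3525,
    "opened": "1995-09-01",
}

if __name__ == "__main__":
    print(to_ntriples(atomize(SAMPLE_ROW)))
```

Each column of the row becomes its own statement about the same subject URI, so other datasets can link to the record or to individual facts about it. A scalable conversion of heterogeneous sources would need considerably more than this, for instance a mapping layer and handling of implicit information.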
The software artifacts of each pillar are also intended to be valorized independently, and the potential for use cases of the whole platform is vast. The OSCB platform can drive a wide spectrum of services to be developed for the government, the private sector or the public. Valorization is two-tiered: indirectly by the stakeholders (e.g., governments), who will be able to provide new "business" services to the public, and directly by software service providers, who for this purpose will exploit the linkages created by the semantic cloud of RDF triples. Some motivational proof-of-concept applications are: (1) the linking of CV templates, annotated video courses and vacant positions to allow competency matching, (2) language-independent querying on annotated image databases (possibly automatically enriched with user tags, geo-tags, etc.) and (3) publishing public sector data on schools that, e.g., helps parents locate suitable schools or enables industry and employment agencies to engage productively and proactively with technical schools in discussions about curriculum content.