RelationalAI
01 January 2020
less than a minute read
The query containment problem is a fundamental algorithmic problem in data management. While this problem is well understood under set semantics, it is by far less understood under bag semantics. In this paper we unveil tight connections between information theory and the conjunctive query containment under bag semantics.
Authors: Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, Dan Suciu. 2020.
In Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS ‘20).
The query containment problem is a fundamental algorithmic problem in data management. While this problem is well understood under set semantics, it is by far less understood under bag semantics. In particular, it is a long-standing open question whether or not the conjunctive query containment problem under bag semantics is decidable.We unveil tight connections between information theory and the conjunctive query containment under bag semantics.These connections are established using information inequalities, which are considered to be the laws of information theory. Our first main result asserts that deciding the validity of a generalization of information inequalities is many-one equivalent to the restricted case of conjunctive query containment in which the containing query is acyclic; thus, either both these problems are decidable or both are undecidable. Our second main result identifies a new decidable case of the conjunctive query containment problem under bag semantics. Specifically, we give an exponential-time algorithm for conjunctive query containment under bag semantics, provided the containing query is chordal and admits a simple junction tree.
We defined named entity recognition (NER) in the legal domain and presented our approach towards generating ground truth data. In what follows, we go over the state-of-the-art in the NER domain and elaborate on the experiments we ran and the lessons we learned.
Read MoreNamed entity recognition is a difficult challenge to solve, particularly in the legal domain. Extracting ground truth labels from long, hierarchical documents is often slow and prone to error. RelationalAI proposes a new, scalable algorithm based on the principles of data-centric AI, designed to meet this challenge and generate high-quality annotations with minimal supervision.
Read MoreMolham shares some history of relational databases, trends in modern cloud-native database systems, and the innovations pioneered at RelationalAI to bring deep learning with relations from idea to reality.
Read More