Learning Models over Relational Data using Sparse Tensors and Functional Dependencies

This paper develops the theoretical foundation of discriminative learning over relational data, showing how the relational structure of both the data and the feature extraction query can be exploited to speed up model training.

Mahmoud Abo Khamis, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich. 2020.

In ACM Transactions on Database Systems (TODS ’20), Vol. 45, No. 2, Article 7.

Integrated solutions for analytics over relational databases are of great practical importance because they avoid the costly loop that data scientists repeat daily: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset into the format of an external learning tool; and train the desired model using this tool. These integrated solutions are also fertile ground for theoretically fundamental and challenging problems at the intersection of relational and statistical data models.
