Rk-means: Fast Clustering for Relational Data

This work extends the class of ML models that can be trained relationally to include some unsupervised models.

Authors: Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich. 2020.

In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS '20).

Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix must be obtained from a relational database via a feature extraction query, the computation cost can be prohibitive, because the data matrix may be (much) larger than the total size of the input relations. This paper introduces Rk-means, a relational k-means algorithm for clustering relational data tuples without having to access the full data matrix.
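To make the size argument concrete, here is a minimal sketch of how a feature extraction query can produce a data matrix far larger than its inputs. The relations `orders` and `items`, their contents, and the helper `natural_join` are hypothetical and chosen only for illustration; this is not the Rk-means algorithm, just the naive materialization that it is designed to avoid.

```python
# Two hypothetical toy relations that share a single join-key value "k".
orders = [("k", f"order_{i}") for i in range(1000)]  # R(key, order)
items = [("k", f"item_{j}") for j in range(1000)]    # S(key, item)


def natural_join(r, s):
    """Naive hash join of two binary relations on their first attribute."""
    index = {}
    for key, value in s:
        index.setdefault(key, []).append(value)
    # Every tuple of r is paired with every tuple of s that has the same key.
    return [(key, a, b) for key, a in r for b in index.get(key, [])]


# Materializing the join plays the role of building the data matrix.
data_matrix = natural_join(orders, items)
input_size = len(orders) + len(items)

print(input_size, len(data_matrix))  # prints: 2000 1000000
```

With 2,000 input tuples in total, the materialized join already has one million rows, so any method that must first build the matrix pays this cost before clustering starts.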

Read the PDF: Rk-means: Fast Clustering for Relational Data