Rk-means: Fast Clustering for Relational Data
This work expands the class of ML models that can be trained relationally to include some unsupervised models.
Authors: Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, Maximilian Schleich. 2020.
In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS ’20).
Conventional machine learning algorithms cannot be applied until a data matrix is available to process. When the data matrix needs to be obtained from a relational database via a feature extraction query, the computation cost can be prohibitive, as the data matrix may be (much) larger than the total input relation size. This paper introduces Rk-means, or the relational k-means algorithm, for clustering relational data tuples without having to access the full data matrix.
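The size blow-up mentioned in the abstract can be illustrated with a minimal Python sketch. The toy relations and the `join_on_key` helper are hypothetical and not taken from the paper, and the sketch does not implement Rk-means itself. It only shows how a feature extraction join can produce a data matrix with far more rows than the input relations contain.

```python
from collections import defaultdict


def join_on_key(left, right):
    """Equi-join two lists of (key, value) tuples on their first field.

    Hypothetical helper for illustration; returns (key, left_value, right_value) rows.
    """
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [
        (key, left_value, right_value)
        for key, left_value in left
        for right_value in index.get(key, ())
    ]


# Two toy relations of 100 tuples each, all sharing the same join key.
n = 100
left = [(0, f"a{i}") for i in range(n)]
right = [(0, f"b{j}") for j in range(n)]

data_matrix = join_on_key(left, right)

input_size = len(left) + len(right)  # 200 tuples in total
output_size = len(data_matrix)       # 10000 rows, since every pair matches

print(input_size, output_size)
```

With every tuple sharing one key, the join output grows quadratically in the relation size, which is the kind of cost that clustering directly over the relations is meant to avoid.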
Read the PDF: Rk-means: Fast Clustering for Relational Data
Related Posts
Defensive Points-To Analysis: Effective Soundness via Laziness
In this work, we present a more principled approach for identifying precision-critical methods, based on general patterns of value flows that explain where most of the imprecision arises in context-insensitive pointer analysis.
Worst-Case Optimal Join Algorithms: Techniques, Results and Open Problems
Worst-case optimal join algorithms are the class of join algorithms whose runtime matches the worst-case output size of a given join query. While the first provably worst-case optimal join algorithm was discovered relatively recently, the techniques and results surrounding these algorithms grow out of decades of research from a wide range of areas, intimately connecting graph theory, algorithms, information theory, constraint satisfaction, database theory, and geometric inequalities.