\n","children":[{"type":"text","text":""}]},{"type":"p","children":[{"type":"text","text":"Efficient join processing is one of the most fundamental and well-studied tasks\nin database research. In this work, we examine algorithms for natural join\nqueries over many relations and describe a new algorithm to process these\nqueries optimally in terms of worst-case data complexity.","italic":true}]},{"type":"p","children":[{"type":"text","text":"Authors: Hung Q. Ngo, Ely Porat, Christopher Ré, Atri Rudra. 2012."}]},{"type":"p","children":[{"type":"text","text":"In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of\nDatabase Systems (PODS ‘12). (Best Paper Award).","italic":true}]},{"type":"p","children":[{"type":"text","text":"Efficient join processing is one of the most fundamental and well-studied tasks\nin database research. In this work, we examine algorithms for natural join\nqueries over many relations and describe a new algorithm to process these\nqueries optimally in terms of worst-case data complexity. Our result builds on\nrecent work by Atserias, Grohe, and Marx, who gave bounds on the size of a\nnatural join query in terms of the sizes of the individual relations in the body\nof the query. These bounds, however, are not constructive: they rely on\nShearer’s entropy inequality, which is information-theoretic. Thus, the previous\nresults leave open the question of whether there exist algorithms whose runtimes\nachieve these optimal bounds. An answer to this question may be interesting to\ndatabase practice, as we show in this article that any project-join style plans,\nsuch as ones typically employed in a relational database management system, are\nasymptotically slower than the optimal for some queries. We present an algorithm\nwhose runtime is worst-case optimal for all natural join queries. Our result may\nbe of independent interest, as our algorithm also yields a constructive proof of\nthe general fractional cover bound by Atserias, Grohe, and Marx without using\nShearer’s inequality. This bound implies two famous inequalities in geometry:\nthe Loomis-Whitney inequality and its generalization, the Bollobás-Thomason\ninequality. Hence, our results algorithmically prove these inequalities as well.\nFinally, we discuss how our algorithm can be used to evaluate full conjunctive\nqueries optimally, to compute a relaxed notion of joins and to optimally (in the\nworst-case) enumerate all induced copies of a fixed subgraph inside of a given\nlarge graph."}]},{"type":"p","children":[{"type":"text","text":"Read the PDF:\n"},{"type":"a","url":"https://dl.acm.org/doi/pdf/10.1145/3180143","title":null,"children":[{"type":"text","text":"Worst-case Optimal Join Algorithms"}]}]}],"_content_source":{"queryId":"src/content/resources/worst-case-optimal-join-algorithms.mdx","path":["resource","body"]}},"_content_source":{"queryId":"src/content/resources/worst-case-optimal-join-algorithms.mdx","path":["resource"]}}},"errors":null,"query":"\n query resource($relativePath: String!) {\n resource(relativePath: $relativePath) {\n ... on Document {\n _sys {\n filename\n basename\n breadcrumbs\n path\n relativePath\n extension\n }\n id\n }\n ...ResourceParts\n }\n}\n \n fragment ResourceParts on Resource {\n __typename\n title\n description\n date\n image\n categories\n authors {\n __typename\n name\n link\n }\n seo {\n __typename\n keywords\n description\n image\n image_alt\n canonical_url\n author\n published\n modified\n language\n robots\n site_name\n content_type\n }\n body\n}\n ","variables":{"relativePath":"worst-case-optimal-join-algorithms.mdx"}},"src/content/meta/meta.md":{"data":{"meta":{"_sys":{"filename":"meta","basename":"meta.md","breadcrumbs":["meta"],"path":"src/content/meta/meta.md","relativePath":"meta.md","extension":".md"},"id":"src/content/meta/meta.md","__typename":"Meta","banner":{"__typename":"MetaBanner","enabled":true,"content":{"type":"root","children":[{"type":"p","children":[{"type":"text","text":"Check out "},{"type":"a","url":"/resources/highlights-of-relationalai-at-snowflake-data-cloud-summit-2024","title":"SF summit highlights","children":[{"type":"text","text":"highlights"}]},{"type":"text","text":" of RelationalAI at "},{"type":"text","text":"Snowflake's Data Cloud Summit 2024!","bold":true}]}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","banner","content"]}},"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","banner"]}},"header":{"__typename":"MetaHeader","links":[{"__typename":"MetaHeaderLinks","text":"Product","url":"/product","style":"default","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header","links",0]}},{"__typename":"MetaHeaderLinks","text":"Company","url":"/company","style":"default","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header","links",1]}},{"__typename":"MetaHeaderLinks","text":"Docs","url":"/docs","style":"default","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header","links",2]}},{"__typename":"MetaHeaderLinks","text":"Resources","url":"/resources/all/1","style":"default","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header","links",3]}},{"__typename":"MetaHeaderLinks","text":"Get Started","url":"/get-started","style":"cta","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header","links",4]}}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","header"]}},"footer":{"__typename":"MetaFooter","sections":[{"__typename":"MetaFooterSections","name":"Product","links":[{"__typename":"MetaFooterSectionsLinks","text":"Overview","url":"/product","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",0,"links",0]}},{"__typename":"MetaFooterSectionsLinks","text":"Use Cases","url":"/product#for-problems-that-matter","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",0,"links",1]}},{"__typename":"MetaFooterSectionsLinks","text":"Capabilities","url":"/product#a-new-toolset","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",0,"links",2]}}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",0]}},{"__typename":"MetaFooterSections","name":"Resources","links":[{"__typename":"MetaFooterSectionsLinks","text":"Documentation","url":"/docs/getting_started","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",1,"links",0]}},{"__typename":"MetaFooterSectionsLinks","text":"News","url":"/resources/news/1","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",1,"links",1]}},{"__typename":"MetaFooterSectionsLinks","text":"Research","url":"/resources/research/1","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",1,"links",2]}},{"__typename":"MetaFooterSectionsLinks","text":"Releases","url":"/resources/releases/1","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",1,"links",3]}}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",1]}},{"__typename":"MetaFooterSections","name":"About Us","links":[{"__typename":"MetaFooterSectionsLinks","text":"Our Company","url":"/company","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",0]}},{"__typename":"MetaFooterSectionsLinks","text":"Contact Us","url":"/get-started","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",1]}},{"__typename":"MetaFooterSectionsLinks","text":"Careers","url":"/careers","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",2]}},{"__typename":"MetaFooterSectionsLinks","text":"Legal","url":"/legal","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",3]}},{"__typename":"MetaFooterSectionsLinks","text":"GDPR","url":"/gdpr","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",4]}},{"__typename":"MetaFooterSectionsLinks","text":"Security & Trust","url":"https://trust.relational.ai/","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2,"links",5]}}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","sections",2]}}],"socials":[{"__typename":"MetaFooterSocials","text":"GitHub","url":"https://github.com/RelationalAI","icon":"https://assets.tina.io/91d76337-e55d-4722-acb5-3106adb895b6/img/logos/github.png","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","socials",0]}},{"__typename":"MetaFooterSocials","text":"LinkedIn","url":"https://www.linkedin.com/company/relationalai/about","icon":"https://assets.tina.io/91d76337-e55d-4722-acb5-3106adb895b6/img/logos/linkedin.png","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","socials",1]}},{"__typename":"MetaFooterSocials","text":"Twitter","url":"https://twitter.com/relationalai","icon":"https://assets.tina.io/91d76337-e55d-4722-acb5-3106adb895b6/img/logos/twitter.png","_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer","socials",2]}}],"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta","footer"]}},"_content_source":{"queryId":"src/content/meta/meta.md","path":["meta"]}}},"errors":null,"query":"\n query meta($relativePath: String!) {\n meta(relativePath: $relativePath) {\n ... on Document {\n _sys {\n filename\n basename\n breadcrumbs\n path\n relativePath\n extension\n }\n id\n }\n ...MetaParts\n }\n}\n \n fragment MetaParts on Meta {\n __typename\n banner {\n __typename\n enabled\n content\n }\n header {\n __typename\n links {\n __typename\n text\n url\n style\n }\n }\n footer {\n __typename\n sections {\n __typename\n name\n links {\n __typename\n text\n url\n }\n }\n socials {\n __typename\n text\n url\n icon\n }\n }\n}\n ","variables":{"relativePath":"./meta.md"}}};
globalThis.tina_info = tina;
})();
Worst-case Optimal Join Algorithms · RelationalAI
Check out highlights of RelationalAI at Snowflake's Data Cloud Summit 2024!
Efficient join processing is one of the most fundamental and well-studied tasks
in database research. In this work, we examine algorithms for natural join
queries over many relations and describe a new algorithm to process these
queries optimally in terms of worst-case data complexity.
In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of
Database Systems (PODS ‘12). (Best Paper Award).
Efficient join processing is one of the most fundamental and well-studied tasks
in database research. In this work, we examine algorithms for natural join
queries over many relations and describe a new algorithm to process these
queries optimally in terms of worst-case data complexity. Our result builds on
recent work by Atserias, Grohe, and Marx, who gave bounds on the size of a
natural join query in terms of the sizes of the individual relations in the body
of the query. These bounds, however, are not constructive: they rely on
Shearer’s entropy inequality, which is information-theoretic. Thus, the previous
results leave open the question of whether there exist algorithms whose runtimes
achieve these optimal bounds. An answer to this question may be interesting to
database practice, as we show in this article that any project-join style plans,
such as ones typically employed in a relational database management system, are
asymptotically slower than the optimal for some queries. We present an algorithm
whose runtime is worst-case optimal for all natural join queries. Our result may
be of independent interest, as our algorithm also yields a constructive proof of
the general fractional cover bound by Atserias, Grohe, and Marx without using
Shearer’s inequality. This bound implies two famous inequalities in geometry:
the Loomis-Whitney inequality and its generalization, the Bollobás-Thomason
inequality. Hence, our results algorithmically prove these inequalities as well.
Finally, we discuss how our algorithm can be used to evaluate full conjunctive
queries optimally, to compute a relaxed notion of joins and to optimally (in the
worst-case) enumerate all induced copies of a fixed subgraph inside of a given
large graph.