Understanding Our World

Max De Marzi

17 August 2022

6 min read

“Walking with a friend in the dark is better than walking alone in the light” is a saying attributed to Helen Keller… someone who spent her life in silent darkness as she was both deaf and blind. Anne Sullivan was her lifelong companion and Miracle Worker who taught her language, reading, and writing. Anne was the friend in the darkness we all hope to have, someone who can help us understand the world. Today we will continue to shine a light on the friendships of the Zachary Karate Club Graph to learn more about the Rel language.

Last time we found all the nodes reachable from any one node, found the connected components in our graph and calculated cosine and jaccard similarities.

Today we'll look at the nuances of the social relations of the members of the Karate Club. We will explore one simple and one complex structure in the graph.

Before we continue, I must warn you: You will stare at five lines of code for an awfully long time. It will melt your brain, as it did mine. I had to scrape it off the floor with an ice cream scoop…but once you understand this code, the hidden power of Rel will be revealed.

Finding Triangles

Let’s start off with triangles by finding three nodes that are all connected together. We can define this method as three sets of neighbor relations. We can also add a clause to force our triangles to be in node ID order and eliminate out of order “duplicates”:

def triangle(a, b, c) = neighbor(a, b) and neighbor(a, c) and neighbor(b, c) and a < b < c
def output = triangle

Output:

#Int64Int64Int64
1123
2124
3128
............
44313334
45323334

It found 45 triangle friendships between our members.

What if we want to see all the triangles that include the member represented by Node 3?

We have three elements, a, b and c, that are part of a triangle. Using the logical operation for true () we can make sure that Node 3 is one of them. The Rel code is simpler than the written version, see for yourself:

def output = a, b, c : triangle(a,b,c) and {a;b;c}(3)

Output:

#Int64Int64Int64
1123
2134
3138
4139
51314
6234
............
103414
113933

We can also turn this into a method “node triangle” by replacing the 3 with a placeholder x to be filled in later and get the same output.

def node_triangle(x, a, b, c) = triangle(a,b,c) and {a;b;c}(x)
def output = node_triangle[3]

Output:

#Int64Int64Int64
1123
2134
3138
4139
51314
6234
............
103414
113933

Demand Transformation

Now you may be wondering… are we finding the node triangles for all of the nodes in the graph first and then filtering for Node 3, or are we only finding the node triangles for Node 3?

This leads us into the topic of demand transformation, which is a fancy term for turning infinite calculations into finite calculations. If you squint your eyes you can kind of think of node_triangle as a database “view”.

Then the question becomes, should you materialize this view or not? To materialize a view means to compute and store the results. It isn’t a big deal for our small karate graph, but think about what happens when you have 3 billion nodes and 20 billion relationships.

Demand transformation is equivalent to materializing only the parts of the view we want, and not the entire thing.

RAI is able to figure out this answer for some queries today, more with every release, and hopefully all very soon. However if you want to force the system to only compute what you ask for now, you can add the @ondemand annotation to our method:

@ondemand
def node_triangle(x, a, b, c) = triangle(a,b,c) and {a;b;c}(x)

def output = node_triangle[3]

Output:

#Int64Int64Int64
1123
2134
3138
4139
51314
6234
............
103414
113933

These annotations are not a hint or suggestion, they are a command and will force the compiler to do what you ask.

We could also ask that a “view” is not created at all by using the @inline annotations we’ve seen before. This will simply copy and paste the code in the definition whenever it is used instead of actually building a definition.

As users, we typically don't want to worry about these kinds of details, but it is nice to know we have ways to tell the system what we want the compiler to do.

Cliques

Now how about Cliques? Do you remember high school or secondary school when you were a teenager? Where a bunch of small groups of kids with shared interests hung out together pretty much exclusively?

Think about the movie Mean Girls or The Craft. Think about Grease if you like musicals or Carrie if horror is more your thing. Those movies have great examples of cliques.

Let's find cliques with four members in our graph. We can define a “four clique” as four nodes that are all neighbors with each other:

def four_cliques(a, b, c, d) = neighbor(a, b) and neighbor(a, c) and neighbor(a, d) and neighbor(b, c) and neighbor(b, d) and neighbor(c, d) and a < b < c < d

def output = four_cliques

Output:

#Int64Int64Int64Int64
11234
21238
312314
41248
512414
61348
713414
82348
923414
109313334
1124303334

That definition is quite long. What if we reuse the triangles definition to shorten it a bit? We can say that a clique of four members starts as a triangle then adds a fourth member that is neighbors with all three members of the triangle like this:

def four_cliques(a, b, c, d) = triangles(a, b, c) and neighbor(a, d) and neighbor(b, d) and neighbor(c, d) and c < d

def output = four_cliques

We get the same output as above.

What about cliques with five members? Well, that's just a clique of four with an additional member who is friends with everyone in the group:

def five_cliques(a, b, c, d, e) = four_cliques(a, b, c, d) and neighbor(a, e) and neighbor(b, e) and neighbor(c, e) and neighbor(d, e) and d < e

def output = five_cliques

Output:

#Int64Int64Int64Int64Int64
112348
2123414

It’s starting to get a bit out of hand. What would be really nice is for the language to let us express what a clique is, and not have to add one level at a time.

Rel lets us do just that using entities.

Entities

Entities are not scary apparitions. They are a tool in the language that lets us define a “thing”.

So what exactly is this clique thing then? We can think of a clique as an entity in our graph where every node must have a relationship with all the other nodes in that clique.

We want to be able to build cliques of different sizes, so we will start at the very beginning with size zero. This is an empty clique, which we will allow by setting it to true.

Next, we need a way to add members to a clique. In order for this to be allowed, the new member must be connected to all the other members.

entity Clique nil = true
entity Clique add_to_clique(member, members) = connected(member, members)

Next, we need to define what it means to be "connected".

Let’s start with the simplest case. We will allow a new node to be added to an empty clique so we can grow the size of our clique from zero members to one member.

So let’s restrict this baseline definition to force the new member to be a node and members to be nil. Since there is only one node, there aren’t any nodes to be related to, so our definition is just:

def connected(new_member, members) = node(new_member) and nil(members)

With that out of the way, we will create another definition for connected when the clique is not empty.

First, the new member must be a neighbor of the last member to be added to the clique.

Second, the new member must have a node ID that is greater than the node ID of the last member (so we don’t get duplicate cliques in various orders).

Third, we need to perform this check recursively against the other members of this clique except the one we just checked.

So how do we do that exactly? We are going to request that the new member is connected to the other members. But where do we get the last_member and other_members from? We get them from add_to_clique in reverse.

def connected(new_member, members) = 
        neighbor(new_member, last_member) 
    and new_member > last_member
    and connected(new_member, other_members)
    and add_to_clique(last_member, other_members, members) from (last_member, other_members)

When your head stops spinning from that definition, you have to come to the realization that Rel is not a procedural language. It is a relational declarative language with roots in logic programming.

The definitions are rules, or statements, that can be true given the right conditions. So if members is a list of nodes in a clique, we can walk backwards through add_to_clique to extract the last_member and other_members.

From the clique, we know the last node added to it must be the one with the highest ID and the other members those nodes with IDs below it.

Let me show you a bit more to see if it clicks. How would we know if a node is a member of a clique? Well two ways.

First, it can be the last member added to the clique. We can define that method using add_to_clique again, with member as the first element, the underscore (which means we don’t care what goes here) as the second value, and members as our clique.

def clique_member(member, members) = add_to_clique(member, _, members)

But if it’s not the last member to be added to the clique, we need to check the other members in the clique.

How do we do that? Well, we need to get other members from the members using add_to_clique again, and then check to see if the first definition is true for our member and the other members:

def clique_member(member, members) = 
     add_to_clique(_, other_members, members) 
     and clique_member(member, other_members) from other_members

Is it starting to make sense now? No? Not yet? Okay, one more.

How do we find the size of a clique? If the clique is empty, then the size is equal to zero. That’s easy right?

def clique_size[members in Clique] = 0, nil(members)

But what if the clique is not empty ? The clique size is equal to one (for the current member) plus the clique size of the other members… and where do we get the other members from?

From going backwards through add_to_clique again and ignoring the current member which is the last one added:

def clique_size[members in Clique] = 1 + clique_size[other_members] from other_members where members = add_to_clique[_, other_members]

If you're wondering why we created clique_member and clique_size it's because we're going to use them next.

Let’s see the members of our cliques for any clique of size five we find in our graph. The first value in our output will be the hash ID of the clique and the second value will be one of the member node IDs:

def output = member : clique_member(member, members), clique_size(members, 5) for members

Output:

#Int64Int64
11009093504529760238743043315682239274821
21009093504529760238743043315682239274822
31009093504529760238743043315682239274823
41009093504529760238743043315682239274824
51009093504529760238743043315682239274828
62177553888083210319565460833826551101601
72177553888083210319565460833826551101602
82177553888083210319565460833826551101603
92177553888083210319565460833826551101604
1021775538880832103195654608338265511016014

Exactly the same as we had earlier in our definition of five_cliques. If we want to make the output pretty we can use table from the standard library.

What table does is take a relation that has three elements: the column name, the key, and the value and lays it out nicely. We can use enumerate to get a number as a column name for the position of the clique member.

def data[id, members] = enumerate[member : clique_member(member, members) ][id], clique_size(members, 5)

def output = table[data]

Output:

#12345
110090935045297602387430433156822392748212348
2217755388808321031956546083382655110160123414

I hope you enjoyed this preview into Rel and its features. You have learned about the power to add annotations to force your will upon the compiler and the ability to define entities.

Rel goes far beyond a simple type system by allowing you to define the rules your creations have to obey.  Finally, you learned that Rel definitions are not methods that turn input into output, but rather rules that can be evaluated in both directions.

Next time, we will follow the social paths of the Karate Club members using the Ford Fulkerson algorithm. Check out Part 1 and Part 2 if you haven't already, and subscribe to our newsletter to stay up to date with RelationalAI.

Related Posts

Get Early Access

Join our community, keep up to date with the latest developments in our monthly newsletter, and get early access to RelationalAI.