Community Detection using Louvain
Louvain
is an algorithm that can quickly find non-overlapping communities within large networks. It is a hierarchical clustering algorithm, that recursively merges communities into a single node and executes the modularity clustering on the condensed graphs.
Our graph will model a social network in which people have friends and their friendship can be of different strength. We will first apply the algorithm to an unweighted undirected graph, then add friendship strength as weights to the edges and look at the difference in the results.
First, we start by importing the relationalai
library and define our model, which we call MyLouvainGraph
. We also create two types called Person
and Friendship
.
%pip install relationalai
import relationalai as rai
from relationalai.std import alias
from relationalai.std.graphs import Graph
model = rai.Model("MyLouvainGraph")
# Graph will hold these types:
# Represents nodes
Person = model.Type("Person")
# Represents connections between nodes
Friendship = model.Type("Friendship")
Let's add some data to our model
We first create a dictionary of people names, who are friends with each other, and a number representing the strength of their friendship. Then we iterate over the dictionary to create objects of type Person
with a name
property. We also create Friendship
objects with strength
property, representing connection between two people. We will use it to create weighted edges between nodes in our graph.
data = {
("Mary", "Bob"): 1.0,
("Mary", "Jessy"): 0.5,
("Mary", "Erica"): 0.1,
("Mary", "Tom"): 0.1,
("Bob", "Frank"): 0.1,
("Jessy", "Erica"): 1.0,
("Erica", "Tom"): 1.0,
("Tom", "Jessy"): 1.0,
("Frank", "Anna"): 1.0,
("Frank", "Mark"): 1.0,
("Frank", "Peter"): 0.5,
("Anna", "Peter"): 0.1,
("Anna", "Mark"): 1.0,
("Peter", "Mark"): 0.1,
}
with model.rule(dynamic = True):
for ((name, friend_name), strength) in data.items():
person = Person.add(name = name)
friend = Person.add(name = friend_name)
Friendship.add(person = person, friend = friend).set(strength = strength)
Creating the graph
Let's start by creating a graph
with Node
and Edge
types. We add all Person
instances as nodes, and assign the name
property so that we can use them in our queries and for visualization purposes. The Friendship
instances are then used to form the edges in our graph.
# Create graph
graph = Graph(model, undirected = True)
Node, Edge = graph.Node, graph.Edge
# add all Person instances as Nodes and assign `name` property (for displaying)
Node.extend(Person, name = Person.name)
# add all Friendship instances as Edges from `person` to `friend` Nodes
with model.rule():
f = Friendship()
Edge.add(f.person, f.friend)
Running the algorithm
Let's add a rule that calculates for every node of the graph the community that it belongs_to
. To derive this value for each node using the Louvain algorithm, we can simply use Graph(model).compute.louvain()
.
# run the Louvain algorithm on the Graph,
# for every Person node assign the community it belongs to
with model.rule():
p = Person()
community = graph.compute.louvain(p)
p.set(belongs_to = community)
Node(p).set(community_id = community)
Querying the Graph
Graph Nodes, Edges and their properties are queried using model.query()
context and graph.Node()
or graph.Edge()
types.
Alternatively, the entire graph representation can be fetched using Graph(model).fetch()
(not recommended to be used for large graphs). This returns a dictionary with two keys, nodes
and edges
, that represents the entire graph.
# query all Edges with `name` and `community_id` properties of the Nodes they connect
with model.query() as select:
e = Edge()
n1 = Node(e.from_)
n2 = Node(e.to)
response = select(alias(n1.name,"from"),alias(n1.community_id,"community (from)"), alias(n2.name,"to"),alias(n2.community_id,"community (to)"))
response
from | community (from) | to | community (to) |
---|---|---|---|
Anna | 1 | Mark | 1 |
Anna | 1 | Peter | 1 |
Bob | 2 | Frank | 1 |
Erica | 2 | Tom | 2 |
Frank | 1 | Anna | 1 |
Frank | 1 | Mark | 1 |
Frank | 1 | Peter | 1 |
Jessy | 2 | Erica | 2 |
Mary | 2 | Bob | 2 |
Mary | 2 | Erica | 2 |
Mary | 2 | Jessy | 2 |
Mary | 2 | Tom | 2 |
Peter | 1 | Mark | 1 |
Tom | 2 | Jessy | 2 |
How many communities were found? And to which community do people belong to?
# For each Person, fetch the name and the community it belongs to
with model.query() as select:
p = Person()
response = select(p.name, p.belongs_to)
response
name | belongs_to |
---|---|
Anna | 1 |
Bob | 2 |
Erica | 2 |
Frank | 1 |
Jessy | 2 |
Mark | 1 |
Mary | 2 |
Peter | 1 |
Tom | 2 |
# Group the response from the above query by the belongs_to property to separate the communities
communities = response.results.groupby("belongs_to").name.apply(list)
# Print the number of communities
count_communities = len(communities)
print(f"Number of communities: {count_communities}")
# Print communities and people that belong to them
for i, c in enumerate(communities):
print(f"Community N{i+1}: {c}")
Number of communities: 2 Community N1: ['Anna', 'Frank', 'Mark', 'Peter'] Community N2: ['Bob', 'Erica', 'Jessy', 'Mary', 'Tom']
Let's find out who is in the same community as Frank
person_name = "Frank"
with model.query() as select:
p1 = Person(name = person_name)
p2 = Person()
# We want to skip the person itself
p1 != p2
p1.belongs_to == p2.belongs_to
response = select(p2.name)
response
name |
---|
Anna |
Mark |
Peter |
Visualizing the results
Let's visualize our graph, to better understand the results. We use Graph(model).visualize()
to visualize our graph. We define a color for each of the communities.
graph.visualize(three = False, style = {
"node": {
"color": lambda n : ['blue', 'pink'][ list(communities.keys()).index(n['community_id']) % 2 ],
"label": lambda n : n['name'],
"hover": lambda n: n['community_id'],
"size": 30
},
"edge": {
"color": "green"
}
}).display(inline = True)
Adding weights to the graph edges
As a final step, let's see how adding weights to the graph edges affects the result of the algorithm.
Let's create a new graph with the same Nodes and Edges, but now using Friendship.strength
property as a weight of an edge.
# Create weighted graph
weighted_graph = Graph(model, undirected = True, weighted = True)
# add all same Person instances as Nodes and assign `name` property
weighted_graph.Node.extend(Person, name = Person.name)
# add all Friendship instances as Edges from `person` to `friend` Nodes with weight property
with model.rule():
f = Friendship()
weighted_graph.Edge.add(f.person, f.friend, weight = f.strength)
# run Louvain algorithm on the new Graph
with model.rule():
n = weighted_graph.Node()
n.set(community_id = weighted_graph.compute.louvain(n))
How many communities were found now?
# query weighted graph nodes names and communities they were assigned to
with model.query() as select:
n = weighted_graph.Node()
response = select(n.name, n.community_id)
# group names by community_id
communities = response.results.groupby("community_id").name.apply(list)
# print communities and names of people that belong to them
for i, c in enumerate(communities):
print(f"Community N{i+1}: {c}")
Community N1: ['Anna', 'Frank', 'Mark', 'Peter'] Community N2: ['Erica', 'Jessy', 'Tom'] Community N3: ['Bob', 'Mary']
Visualizing the weighted graph
Let's make the size of the Edges proportional to the weight. Additionally, we'll define more colors since we now have more communities.
weighted_graph.visualize(three = False, style = {
"node": {
"color": lambda n : ['yellow', 'orange', 'blue', 'pink'][ list(communities.keys()).index(n['community_id']) % 4 ],
"label": lambda n : n['name'],
"hover": lambda n: n['community_id'],
"size": 30
},
"edge": {
"color": "green",
"size": lambda e: e['weight'] * 3.5
}
}).display(inline = True)