Infomap
The infomap
function returns a community
, to which a node belongs, which is determined using Infomap algorithm.
In the analysis of social networks, for instance, it can often be useful to identify subgroups of connected people. These subgroups can be detected by using an algorithm such as Infomap.
Let us look at a very small example social graph and use Infomap to identify which people belong to which subgroups.
Our graph will model a social network in which people are connected based on common hobbies. We will first apply the algorithm to unweighted undirected graph, then add the number of common hobbies as weights to the edges and look at the difference in the results.
First, we need to import the relationalai
library and define our model, which we call MyInfomapGraph
. We also create two types called Person
and HobbyLink
.
%pip install relationalai
import relationalai as rai
from relationalai.std import alias
from relationalai.std.graphs import Graph
model = rai.Model("MyInfomapGraph")
# Model will hold these types:
# Represent nodes
Person = model.Type("Person")
# Represent connections between nodes
HobbyLink = model.Type("HobbyLink")
Let's add some data to our model
We first create a dictionary of people names, who have common hobbies, and how many hobbies they share. Then we iterate over the dictionary to create objects of type Person
with a name
property. We also create HobbyLink
objects with common_count
property, representing connection between two people. We will use it to create weighted edges between nodes in our graph.
data = {
("Mary", "Bob"): 5,
("Mary", "Jessy"): 2,
("Mary", "Erica"): 1,
("Mary", "Tom"): 1,
("Mary", "Chris"): 1,
("Bob", "Chris"): 3,
("Bob", "Frank"): 1,
("Jessy", "Erica"): 5,
("Erica", "Tom"): 5,
("Tom", "Jessy"): 5,
("Frank", "Anna"): 5,
("Frank", "Mark"): 5,
("Frank", "Peter"): 2,
("Anna", "Peter"): 1,
("Anna", "Mark"): 5,
("Peter", "Mark"): 1,
}
with model.rule(dynamic = True):
for (fellows, common_hobbies) in data.items():
name, fellow_name = fellows
person = Person.add(name = name)
fellow = Person.add(name = fellow_name)
HobbyLink.add(person = person, fellow = fellow).set(count_common = common_hobbies)
Creating the graph
Let's start by creating a graph
with Node
and Edge
collections. We add all Person
instances as nodes, and assign the name
property so that we can use them in our queries and for visualization purposes. The HobbyLink
instances are then used to form the edges
in our graph.
# Create graph
graph = Graph(model, undirected = True)
Node, Edge = graph.Node, graph.Edge
# add all Person instances as Nodes and assign `name` property (for displaying)
Node.extend(Person, name = Person.name)
# add all HobbyLink instances as Edges from `person` to `fellow` Nodes
with model.rule():
l = HobbyLink()
Edge.add(l.person, l.fellow)
Running the algorithm
Let's add a rule that calculates for every node of the graph the community that it belongs_to
. To derive this value for each node using the infomap algorithm, we can simply use Graph(model).compute.infomap()
.
# run infomap algorithm on the Graph,
# for every Person node assign the community it belongs to
with model.rule():
p = Person()
community = graph.compute.infomap(p)
p.set(belongs_to = community)
Node(p).set(community = community)
Querying the Graph
Graph Nodes, Edges and their properties are queried using model.query()
context and graph.Node()
or graph.Edge()
types.
Alternatively, the entire graph representation can be fetched using Graph(model).fetch()
. This returns a dictionary with two keys, nodes
and edges
, that represents the entire graph.
# query all Edges with `name` and `community` properties of the Nodes they connect
with model.query() as select:
e = Edge()
n1 = Node(e.from_)
n2 = Node(e.to)
response = select(alias(n1.name, 'from'), alias(n1.community, 'community (from)'), alias(n2.name, 'to'), alias(n2.community, 'community (to)'))
response
# fetch the entire graph dictionary (not recommended to be used for large graphs)
# graph_dict = graph.fetch()
from | community (from) | to | community (to) |
---|---|---|---|
Anna | 1 | Mark | 1 |
Anna | 1 | Peter | 1 |
Bob | 2 | Chris | 2 |
Bob | 2 | Frank | 1 |
Erica | 2 | Tom | 2 |
Frank | 1 | Anna | 1 |
Frank | 1 | Mark | 1 |
Frank | 1 | Peter | 1 |
Jessy | 2 | Erica | 2 |
Mary | 2 | Bob | 2 |
Mary | 2 | Chris | 2 |
Mary | 2 | Erica | 2 |
Mary | 2 | Jessy | 2 |
Mary | 2 | Tom | 2 |
Peter | 1 | Mark | 1 |
Tom | 2 | Jessy | 2 |
How many communities were found? And to which communities do people belong?
# For each Person, fetch the name and the community they belong to
with model.query() as select:
p = Person()
response = select(p.name, p.belongs_to)
response
name | belongs_to |
---|---|
Anna | 1 |
Bob | 2 |
Chris | 2 |
Erica | 2 |
Frank | 1 |
Jessy | 2 |
Mark | 1 |
Mary | 2 |
Peter | 1 |
Tom | 2 |
# Group the response from the above query by the belongs_to property to separate the communities
communities = response.results.groupby("belongs_to").name.apply(list)
# Print the number of communities
count_communities = len(communities)
print(f"Number of communities: {count_communities}")
# Print communities and people that belong to them
for i, c in enumerate(communities):
print(f"Community N{i+1}: {c}")
Number of communities: 2 Community N1: ['Anna', 'Frank', 'Mark', 'Peter'] Community N2: ['Bob', 'Chris', 'Erica', 'Jessy', 'Mary', 'Tom']
Let's find out who is in the same community as Anna
person_name = "Anna"
with model.query() as select:
p1 = Person(name = person_name)
p2 = Person()
# We want to skip the person itself
p1 != p2
p1.belongs_to == p2.belongs_to
response = select(p2.name)
response
name |
---|
Frank |
Mark |
Peter |
Visualizing the results
Let's visualize our graph, to better understand the results. We use Graph(model).visualize()
to visualize our graph. We define a color for each of the communities.
graph.visualize(three = False, style = {
"node": {
"color": lambda n : ['blue', 'pink'][list(communities.keys()).index(n['community']) % 2],
"label": lambda n : n['name'],
"hover": lambda n: n['community'],
"size": 30
},
"edge": {
"color": "green"
}
}).display(inline = True)
Adding weights to the graph edges
As a final step, let's see how adding weights to the graph edges affects the result of the algorithm.
Let's create a new graph with the same Nodes and Edges, but now using HobbyLink.count_common
property as a weight of an edge.
# Create weighted graph
weighted_graph = Graph(model, undirected = True, weighted = True)
# add all same Person instances as Nodes and assign `name` property
weighted_graph.Node.extend(Person, name = Person.name)
# add all HobbyLink instances as Edges from `person` to `fellow` Nodes with weight property
with model.rule():
l = HobbyLink()
weighted_graph.Edge.add(l.person, l.fellow, weight = l.count_common)
# run infomap algorithm on the new Graph
with model.rule():
n = weighted_graph.Node()
n.set(community = weighted_graph.compute.infomap(n))
How many communities were found now?
# query weighted graph nodes names and communities they were assigned to
with model.query() as select:
n = weighted_graph.Node()
response = select(n.name, n.community)
# group names by community_id
communities = response.results.groupby("community").name.apply(list)
# print communities and names of people that belong to them
for i, c in enumerate(communities):
print(f"Community N{i+1}: {c}")
Community N1: ['Anna', 'Frank', 'Mark', 'Peter'] Community N2: ['Erica', 'Jessy', 'Tom'] Community N3: ['Bob', 'Chris', 'Mary']
Visualizing the weighted graph
Let's make the size of the Edges proportional to the weight. Additionally, we'll define more colors since we now have more communities.
weighted_graph.visualize(three = False, style = {
"node": {
"color": lambda n : ['blue', 'pink', 'yellow'][list(communities.keys()).index(n['community']) % 3],
"label": lambda n : n['name'],
"hover": lambda n: n['community'],
"size": 30
},
"edge": {
"color": "green",
"size": lambda e: e['weight']
}
}).display(inline = True)