Core Concepts#
RelationalAI (RAI) adds a layer of intelligence to your Snowflake data cloud, using Python-based models to define the relationships, rules, and reasoning that drive your organization’s decision-making process.
This guide covers the core concepts of the RAI Python package. To follow along, you’ll need:
- A Snowflake account with the RAI Native App installed.
- A Snowflake user that has been granted the app_user application role.
- The relationalai Python package, either installed locally or in a cloud notebook.
Table of Contents#
- What is a Model?
- Creating a Model
- Declaring Types
- Populating Types With Objects
- Setting Object Properties
- Executing Queries
- Capturing Knowledge in Rules
- Summary and Next Steps
What is a Model?#
Models represent the entities important to your organization as objects. Each object represents a real-world entity, such as a customer, account, or transaction, and it has properties that describe its attributes and relationships to other objects.
Models describe objects by defining:
- Types, which represent categories of objects, such as Customer or Account.
- Rules, which capture knowledge by setting object properties and assigning objects to types. For instance, a rule might assign all Account objects that have no transactions within the last year to an Inactive type.
Rules may leverage advanced features, like graph analytics, to infer relationships and properties of objects and drive more sophisticated decision-making. A rule in a financial fraud model, for example, might use a graph algorithm to identify suspicious accounts that need to be investigated.
You can query a model to answer questions about the objects it describes. Queries are evaluated by the RAI Native App installed in your Snowflake account, where objects are created and assigned types and properties based on the model’s rules, using data from Snowflake tables shared with the app.
The following diagram illustrates the relationship between a model, the RAI Native App, and Snowflake data for a hypothetical financial fraud model:
Rules and queries are written in Python using RAI’s declarative query-builder syntax. In the following sections, you’ll learn the basics of creating a model, defining types and rules, and executing queries.
Creating a Model#
Instantiate the Model class to create a model:

import relationalai as rai
model = rai.Model("MyModel")

Models connect to your Snowflake account using the active profile in your raiconfig.toml file.
However, you may create a model without a config file (for example, to use RAI in a cloud notebook environment) by providing connection details to the Model constructor.
See Local Installation and Cloud Notebooks for details.
You’ll use the model object to declare types, define rules, and execute queries.
Declaring Types#
Use the Model.Type() method to declare a type:

import relationalai as rai
model = rai.Model("MyModel")
Person = model.Type("Person")

model.Type("Person") returns an instance of the Type class, which is used to define objects in rules and queries.
By convention, variable names for types are capitalized.
Populating Types With Objects#
When you query a model, objects are created and assigned types and properties according to the model’s rules. Data for objects may come from Snowflake tables or be defined in rules.
Defining Objects From Rows In Snowflake Tables#
To populate a type with objects from a Snowflake table, pass the fully qualified table name to the Type constructor’s source parameter:

import relationalai as rai
model = rai.Model("MyModel")
# <db> and <schema> are placeholders for a Snowflake database and schema
# containing a table named "people".
Person = model.Type("Person", source="<db>.<schema>.people")

To use a Snowflake table as a source, the table must first be streamed into the RAI Native App by a Snowflake user with the app_admin application role.
Both tables and views can be used as sources.
See Create a Stream for details.
Setting source="<db>.<schema>.people" ensures that whenever the model is queried, objects that correspond to rows in the <db>.<schema>.people table are created and assigned to the Person type.
Columns in the source table are used to set properties on each object, with lowercased column names used as property names.
Only columns with the following Snowflake data types are supported in source tables:
Defining Objects in Rules#
You may specify objects directly in your model’s rules using the Type.add() method.
Rules are defined using model.rule(), which returns a context manager and is used in a with statement.
The following creates a model with a Person type and defines two Person objects in a rule:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare a Person type.
Person = model.Type("Person")
# Use Person.add() to define two Person objects.
with model.rule():
    Person.add(name="Alice", age=10, favorite_color="blue")
    Person.add(name="Bob", age=20)

Person.add() declares an object of type Person with properties specified as keyword arguments.
Objects of the same type may have different properties.
For instance, the above rule defines one Person object with a favorite_color property and another without.
.add() does not create objects when it is called.
Instead, it specifies objects and properties that are created when the model is queried.
Properties passed to .add() are hashed to create a unique internal identifier for each object.
Calling .add() twice with the same properties and values does not generate duplicate objects.
For example, the following rule defines two distinct objects, not three:

with model.rule():
    # Define a Person object with three properties: name, age, and favorite_color.
    Person.add(name="Bob", age=20, favorite_color="green")
    # Define a second Person object with two properties: name and age. This object
    # is distinct from the first object because its hash doesn't include the
    # favorite_color property.
    Person.add(name="Bob", age=20)
    # The following does not define a third object because the properties hash
    # to the same value as above.
    Person.add(name="Bob", age=20)

The name of the type that .add() is called on is included in the object’s hash.
This means that adding an object with the same properties to two different types results in two distinct objects being created:

Person = model.Type("Person")
Student = model.Type("Student")
with model.rule():
    # Define a Person object with name and age properties.
    Person.add(name="Alice", age=10)
    # Define a Student object with name and age properties. Even though the name
    # and age properties are the same as the Person object above, the objects are
    # distinct because they are created in different types.
    Student.add(name="Alice", age=10)

You may, however, assign the same object to multiple types by passing additional types as positional arguments to .add().
For instance, the following rule defines one Person object that is also a Student:

with model.rule():
    Person.add(Student, name="Alice", age=10)

As a general rule, call .add() from the most general type an object belongs to, and pass any subtypes it belongs to as positional arguments.
However, you are free to model things as you see fit.
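The identity rules described in this section can be mimicked in plain Python. The sketch below is only an analogy: object_id is a hypothetical helper, and SHA-256 over a JSON payload is an assumption chosen for illustration, not the hashing scheme the RAI Native App actually uses. It shows why the same type name and properties always yield the same object, while a different property set or a different type yields a distinct one.

```python
import hashlib
import json


def object_id(type_name: str, **props) -> str:
    """Hypothetical helper: derive a stable identifier from a type name and properties."""
    # Sort the properties so that keyword order does not affect the identifier.
    payload = json.dumps([type_name, sorted(props.items())], default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Same type, different property sets: two distinct identifiers.
bob_green = object_id("Person", name="Bob", age=20, favorite_color="green")
bob_plain = object_id("Person", name="Bob", age=20)
# Same type and same properties (in a different order): same identifier as bob_plain.
bob_again = object_id("Person", age=20, name="Bob")

# Same properties, different types: distinct identifiers.
alice_person = object_id("Person", name="Alice", age=10)
alice_student = object_id("Student", name="Alice", age=10)

objects = {bob_green, bob_plain, bob_again}
print(len(objects))                   # 2 distinct objects, not 3
print(alice_person != alice_student)  # True
```

Because identity depends on everything passed to the helper, adding or dropping a single keyword changes which object is referenced, which is the same reason the guide distinguishes the two Bob objects.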
In the Capturing Knowledge in Rules section, you’ll learn how to assign types to objects based on their properties and relationships to other objects. But first, let’s take a closer look at properties.
Setting Object Properties#
Objects may have two types of properties:
- Single-valued properties are assigned a single value.
- Multi-valued properties may be assigned multiple values.
Think of properties as arrows that connect objects to a value. They may point to values with the following types:
- Strings
- Numbers, such as integers and floats
- Booleans
- Dates and datetimes
- Other objects
Multi-valued properties do not point to a single list or set of values. Instead, they point to multiple values simultaneously.
The following diagram illustrates three objects and their properties. Single-valued properties are displayed as solid arrows and multi-valued properties as dashed arrows:
The status of a property as single- or multi-valued is fixed across the entire model.
There is only one name property in the above diagram, and it is single-valued.
You may set the name property on any object, but it must always point to a single value.
Similarly, there is only one pets property, which may also be set on any object but is always multi-valued.
By convention, we use plural names for multi-valued properties to distinguish them from single-valued properties.
Single-Valued Properties#
You can declare single-valued properties using the Property.declare() method:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")
# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.age.declare()
# Define Cat, Dog, and Person objects.
with model.rule():
    # Note that calling .add() automatically creates single-valued properties
    # even if their declarations are missing.
    Cat.add(name="Whiskers")
    Dog.add(name="Fido")
    Person.add(name="Bob")

In this example:
- Cat.name returns an instance of the Property class. This might look a bit strange, since the .name attribute is never defined explicitly; Type instances allow dynamic attribute access.
- .declare() is called from Cat.name, Dog.name, and Person.name to declare that Cat, Dog, and Person objects all use the single-valued name property.
You aren’t required to set values for every declared property.
For instance, the Person object defined in the rule above doesn’t have an age value.
.declare() is optional.
Properties are declared implicitly when you set them in a rule.
Properties set with .add() serve as the object’s primary key.
When you call .add(), it returns an Instance that references the object.
Use the Instance.set() method to define additional single-valued properties that aren’t part of the object’s primary key:

with model.rule():
    # Define a Cat object with primary key property id set to 1. Note that if a
    # Cat object with id=1 already exists, .add() returns a reference to the
    # existing object.
    cat = Cat.add(id=1)
    # Set the name property for the object. name is single-valued and is not
    # part of the object's primary key.
    cat.set(name="Whiskers")
    # .set() also returns an Instance object, so you can chain calls. The
    # following is equivalent to the above:
    Cat.add(id=1).set(name="Whiskers")

You can only set a single-valued property once per object:

# The following pair of rules is invalid.
with model.rule():
    Cat.add(id=1).set(name="Whiskers")
with model.rule():
    # Cat(id=1) returns an Instance that references the Cat object with id=1.
    # This is the same object defined in the previous rule. Setting two values
    # for the same single-valued property is invalid.
    Cat(id=1).set(name="Fluffy")

It’s tempting to interpret the rules in the preceding example as first creating a Cat object with a name property set to "Whiskers" and then updating the name property to "Fluffy".
But that’s not what happens.
Rules are not executed sequentially. Each rule describes a fact about the objects that are created when you query the model. The two rules above describe contradictory facts for the same object, which is invalid.
You may, however, set different properties for the same object in different rules:

# Define a Cat object with id=1 and set its name property to "Whiskers".
with model.rule():
    Cat.add(id=1).set(name="Whiskers")
# Set the breed and color properties for the Cat object with id=1.
with model.rule():
    Cat(id=1).set(breed="Siamese", color="white")

Contradictory rules may cause undefined behavior and should be avoided. Exceptions may be raised when invalid rules are detected, but this is not guaranteed. Use the RAI debugger to help identify and resolve issues with your model.
Multi-Valued Properties#
Declare multi-valued properties using the Property.has_many() method:

import relationalai as rai
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")
# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.pets.has_many()  # pets is multi-valued.
# Define Cat, Dog, and Person objects.
with model.rule():
    whiskers = Cat.add(name="Whiskers")
    fido = Dog.add(name="Fido")
    bob = Person.add(name="Bob")
    # Extend Bob's multi-valued pets property with Whiskers and Fido. Note that
    # calling bob.pets.extend() automatically creates the multi-valued pets property
    # even if its declaration is missing.
    bob.pets.extend([fido, whiskers])
    # Alternatively, you may add values to a multi-valued property one at a time
    # using bob.pets.add(). The following two lines are equivalent to the one above.
    # Like .extend(), .add() automatically creates the multi-valued pets property
    # if its declaration is missing.
    bob.pets.add(fido)
    bob.pets.add(whiskers)

In this example:
- Person.pets.has_many() declares that Person objects use a multi-valued pets property.
- bob.pets returns an InstanceProperty that references the bob object’s pets property. Use bob.pets.extend() or bob.pets.add() to set values for the pets property.
Just like single-valued properties, you aren’t required to set values for every multi-valued property that you declare.
.has_many() is optional in most cases.
Multi-valued properties are declared implicitly when you set them in a rule.
The only exception is when you define objects from Snowflake tables. By default, properties created from columns in the source table are single-valued. If you intend to use a column as a multi-valued property, you must declare it as such.
Unlike single-valued properties, you may set multi-valued properties multiple times. Each rule that sets a multi-valued property adds to the property’s values:

# Define a Person object with two pets.
with model.rule():
    bob = Person.add(name="Bob")
    bob.pets.extend([
        Cat.add(name="Whiskers"),
        Dog.add(name="Fido"),
    ])
# Define another Dog object and set it as Bob's pet.
with model.rule():
    # Person(name="Bob") returns an Instance that references the Person object
    # with name="Bob".
    bob = Person(name="Bob")
    # Add another Dog object to Bob's pets. Here, bob.pets.add() is used instead
    # of bob.pets.extend() since only one pet is being added.
    bob.pets.add(Dog.add(name="Buddy"))

The second rule doesn’t replace the pets that are set in the first rule.
Rather, Bob’s pets property points to three objects: Whiskers, Fido, and Buddy.
Multi-valued properties have set-like semantics. Setting the same value for a multi-valued property multiple times doesn’t create duplicate property values.
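The difference between the two kinds of properties can be illustrated with a toy fact store in plain Python. This is a sketch under stated assumptions, not how RAI is implemented: FactStore and its methods are hypothetical names, and unlike RAI (where exceptions for contradictory rules are not guaranteed) this toy always raises on a conflict.

```python
from collections import defaultdict


class FactStore:
    """Hypothetical toy store in which rules assert facts about objects."""

    def __init__(self):
        self.single = {}               # (object, property) -> one value
        self.multi = defaultdict(set)  # (object, property) -> set of values

    def set(self, obj, prop, value):
        """Assert a single-valued fact; a different value for the same key conflicts."""
        key = (obj, prop)
        if key in self.single and self.single[key] != value:
            raise ValueError(f"contradictory values for {prop!r} on {obj!r}")
        self.single[key] = value

    def add(self, obj, prop, value):
        """Assert a multi-valued fact; repeated values collapse (set semantics)."""
        self.multi[(obj, prop)].add(value)


store = FactStore()
store.set("cat:1", "name", "Whiskers")
store.set("cat:1", "breed", "Siamese")  # a different property on the same object is fine

# Facts from several "rules" accumulate; the repeated "Fido" adds nothing new.
for pet in ["Whiskers", "Fido", "Buddy", "Fido"]:
    store.add("person:Bob", "pets", pet)
pets = sorted(store.multi[("person:Bob", "pets")])
print(pets)  # ['Buddy', 'Fido', 'Whiskers']

# A second, different value for a single-valued property is a contradiction.
try:
    store.set("cat:1", "name", "Fluffy")
    conflict = False
except ValueError:
    conflict = True
print(conflict)  # True
```

The order in which facts are asserted does not change the final contents of the multi-valued set, which mirrors the point that rules are not executed sequentially.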
Executing Queries#
Queries are written using the model.query() method, which, like model.rule(), returns a context manager that is used in a with statement.
This section introduces the basics of querying a model. For a more in-depth look at RAI’s query-builder syntax, see the Basic Functionality guide.
Selecting Object Properties#
The following example creates a model with a Person type, defines some Person objects, and then queries the model for their IDs, names, and favorite colors:

import relationalai as rai
# =====
# SETUP
# =====
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare a Person type.
Person = model.Type("Person")
with model.rule():
    # Define Person objects.
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)  # Carol has no favorite_color.
    # Connect people to their friends with a multi-valued friends property.
    # Visually, the friends property looks like: Alice <-> Bob <-> Carol.
    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)
# =======
# EXAMPLE
# =======
# Query the model for the IDs, names, and favorite colors of people.
with model.query() as select:
    person = Person()
    response = select(person.id, person.name, person.favorite_color)
# Print the query results.
print(response.results)
#    id   name favorite_color
# 0   1  Alice           blue
# 1   2    Bob          green
# 2   3  Carol            NaN

Let’s break down the model.query() block:
- Calling Person() returns an Instance that references a Person object, which is assigned here to the person variable.
- select(person.id, person.name, person.favorite_color) selects the id, name, and favorite_color properties of person objects and returns a Context object used to access the query results.
- When the Python interpreter reaches the end of the with block, the model compiles the query together with all of its types and rules. The compiled query is sent to the RAI Native App for evaluation, and execution is blocked until a response is received. Results are assigned to the Context object’s .results attribute and, by default, are returned as a pandas DataFrame with three columns labeled id, name, and favorite_color and a row for each person.
If an object lacks a value for a selected property, a null value is returned.
For example, Carol has no favorite color, so it’s displayed as NaN in the results.
Refer to Dealing With Null Values for more information on handling null values in queries.
Queries may be configured to return Snowpark DataFrames instead of pandas DataFrames. See Changing the Query Result Format for details.
Queries work a bit like a SQL SELECT statement, but with a different order:
RAI Python | SQL Interpretation |
---|---|
person = Person() | FROM Person person |
select(person.name, person.favorite_color) | SELECT person.name, person.favorite_color |
Are you a SQL developer? Check out the RAI for SQL Users guide for a comparison of RAI’s query-builder syntax with SQL.
When you select a single-valued property, like person.name, one row is returned for each person object filtered by the query.
This can lead to duplicate rows in the results if multiple objects have the same property value.
For instance, querying the model for people’s ages returns two rows where the age is 18:

with model.query() as select:
    person = Person()
    response = select(person.age)
print(response.results)
#    age
# 0   16
# 1   18
# 2   18

One of the duplicate rows is for Bob, and the other is for Carol.
Use select.distinct() to remove duplicate rows:

with model.query() as select:
    person = Person()
    response = select.distinct(person.age)
print(response.results)
#    age
# 0   16
# 1   18

Selecting a multi-valued property, like person.friends, returns a row for each person and each of their friends, which can again lead to duplicate rows in the results:

with model.query() as select:
    person = Person()
    response = select(person.friends)
print(response.results)
#                   friends
# 0  XXolLCtOngI6pFJ128Xktg
# 1  d1SmRsWF5TLVmYhCCPhD9g
# 2  g4rDjPY1HHWkEikWQXw+3Q
# 3  g4rDjPY1HHWkEikWQXw+3Q

person.friends points to objects, so the friends column in the results displays the internal identifier of each friend object.
There are two rows with the same identifier.
This makes sense because both Alice and Carol have Bob as a friend, so we should expect two rows corresponding to the Bob object.
Property access may be chained.
For example, person.friends.name accesses the name property of each person.friends object.
Selecting person.friends.name does not return a row per person per friend.
Instead, it returns a row per unique friend object in the set of all person.friends objects:

with model.query() as select:
    person = Person()
    response = select(person.friends.name)
print(response.results)
#     name
# 0  Alice
# 1    Bob  <-- Only one row for Bob, even though he's friends with two people.
# 2  Carol

In general, whatever is to the left of the last dot (.) in a property chain determines the property’s key.
Since person.friends is keyed by person, selecting it returns rows for each person object.
person.friends.name is keyed by friends, so selecting it returns rows for each unique object assigned to some person’s friends property.
When you select multiple properties, multiple keys may be used to determine the rows in the results:

with model.query() as select:
    person = Person()
    friend = person.friends
    response = select(person.name, friend.name)
print(response.results)
#     name  name2
# 0  Alice    Bob
# 1    Bob  Alice
# 2    Bob  Carol
# 3  Carol    Bob

Each row in the results is keyed by both person and friends, so there is one row for each person-friend pair.
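The keying behavior described above can be reproduced with ordinary Python comprehensions. The sketch below is an analogy only: it uses a plain dictionary as a stand-in for the model (the data layout is an assumption made for illustration) and does not call the RAI API.

```python
# Toy data mirroring the guide's example; friends are stored as lists of ids.
people = {
    1: {"name": "Alice", "friends": [2]},
    2: {"name": "Bob", "friends": [1, 3]},
    3: {"name": "Carol", "friends": [2]},
}

# Like select(person.friends): keyed by person, so one row per person per friend.
friend_rows = [fid for p in people.values() for fid in p["friends"]]

# Like select(person.friends.name): keyed by friend, so one row per unique friend.
friend_names = sorted({people[fid]["name"] for fid in friend_rows})

# Like select(person.name, friend.name): keyed by both, so one row per pair.
pairs = sorted(
    (p["name"], people[fid]["name"])
    for p in people.values()
    for fid in p["friends"]
)

print(friend_rows)   # [2, 1, 3, 2]  (Bob's id appears twice)
print(friend_names)  # ['Alice', 'Bob', 'Carol']
print(pairs)
# [('Alice', 'Bob'), ('Bob', 'Alice'), ('Bob', 'Carol'), ('Carol', 'Bob')]
```

The set comprehension plays the role of the friend key: duplicates collapse because each friend object contributes at most one row.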
Filtering Objects#
Declare conditions in a query to filter objects:

with model.query() as select:
    person = Person()
    person.age >= 18
    response = select(person.name, person.age)
print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18

In typical Python, person.age >= 18 returns a Boolean value.
But with RAI, it returns an Expression object, the truth value of which isn’t determined until the query is compiled and processed by the RAI Native App.
As a result, keywords like if and operators like and and or are forbidden in queries.
See A Note About Logical Operators for more information.
The preceding query selects only people who are at least 18 years old.
person.age >= 18 is similar to a SQL WHERE clause:
RAI Python | SQL Interpretation |
---|---|
person = Person() | FROM Person person |
person.age >= 18 | WHERE person.age >= 18 |
select(person.name, person.age) | SELECT person.name, person.age |
Queries can perform joins and declare multiple conditions:

with model.query() as select:
    person1, person2 = Person(), Person()
    person1.age >= 16
    person2.favorite_color.in_(["red", "blue"])
    response = select(person1.name, person2.name)
print(response.results)
#     name  name2
# 0  Alice  Alice
# 1    Bob  Alice
# 2  Carol  Alice

Each line in the query body is combined with AND, so this query selects pairs of people where the first person is at least 16 and the second’s favorite color is red or blue.
See Filtering Objects by Type and Filtering Objects by Property Value in the Basic Functionality guide for more information on filtering objects.
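For readers who think in terms of loops, the join query above behaves like a filtered Cartesian product. The following plain-Python sketch is an analogy that assumes the same three people as the guide's example; it does not use the RAI API, and the list-of-dicts layout is an assumption made for illustration.

```python
from itertools import product

people = [
    {"name": "Alice", "age": 16, "favorite_color": "blue"},
    {"name": "Bob", "age": 18, "favorite_color": "green"},
    {"name": "Carol", "age": 18, "favorite_color": None},  # no favorite color
]

# Two independent Person() variables range over every pair of people.
# Each condition in the query body narrows the pairs, as if joined with AND.
rows = [
    (p1["name"], p2["name"])
    for p1, p2 in product(people, repeat=2)
    if p1["age"] >= 16 and p2["favorite_color"] in ("red", "blue")
]

print(rows)  # [('Alice', 'Alice'), ('Bob', 'Alice'), ('Carol', 'Alice')]
```

Everyone satisfies the age condition, and only Alice satisfies the color condition, so each person is paired with Alice and nobody else.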
Changing the Query Result Format#
Queries return a pandas DataFrame by default.
This downloads all of the results and, for large result sets, may be slow and consume a lot of memory.
To avoid this, you can set the query’s format parameter to "snowpark" to return a Snowpark DataFrame instead:

with model.query(format="snowpark") as select:
    person = Person()
    response = select(person.name, person.favorite_color)
response.results.show()
# -----------------------------
# |"NAME"  |"FAVORITE_COLOR"  |
# -----------------------------
# |Alice   |blue              |
# |Bob     |green             |
# |Carol   |null              |
# -----------------------------

Alternatively, set format="snowpark" when you instantiate the model to change the default result format:

model = rai.Model("MyModel", format="snowpark")

Data for Snowpark DataFrames are stored in Snowflake. Only a small portion of the data is downloaded for display. You can use DataFrame methods to manipulate the data and even save the results to a table in a Snowflake database. See Writing Results to Snowflake for more information.
Capturing Knowledge in Rules#
Rules let you express knowledge about objects and their relationships.
For example, a rule can capture the fact “All people who are 18 years or older are adults” by setting Person objects to an Adult type if they meet the age requirement:

import relationalai as rai
# =====
# SETUP
# =====
# Create a model named "MyModel".
model = rai.Model("MyModel")
# Declare Person and Adult types.
Person = model.Type("Person")
Adult = model.Type("Adult")
# Define Person objects.
with model.rule():
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)
    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)
# =======
# EXAMPLE
# =======
# Define a rule that sets Person objects to the Adult type if they are 18 or older.
with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)
# Query the model for the names and ages of adults.
with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age)
print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18

Rules work like queries, except that instead of selecting things, they add new objects or set types and properties of existing objects.
Using the SQL analogy, you can interpret the above rule as something like the following:
RAI Python | SQL Interpretation |
---|---|
person = Person() | FROM Person person |
person.age >= 18 | WHERE person.age >= 18 |
person.set(Adult) | INSERT INTO Adult VALUES (person) |
Methods like Type.add() and Instance.set() act on the objects filtered by the rule.
But rules don’t have to stop after just one action.
They may continue to filter objects and take additional actions:

with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)
    person.friends.favorite_color == "blue"
    person.set(has_blue_friend=True)
with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age, adult.has_blue_friend)
print(response.results)
#     name  age has_blue_friend
# 0    Bob   18            True
# 1  Carol   18             NaN

In this version of the rule:
- First, person.age >= 18 filters people who are 18 or older. The people who pass this filter are set to the Adult type.
- Next, person.friends.favorite_color == "blue" filters the remaining people who have a friend whose favorite color is blue. The has_blue_friend property is set to True for these people.
Filters and actions are applied in the declared order, with actions affecting only objects that pass all preceding filters.
For example, both Bob and Carol are adults, but only Bob’s has_blue_friend property is set to True.
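The way filters and actions interleave in this rule can be traced with an ordinary loop. The sketch below is a plain-Python analogy, not RAI code: the dictionary layout and the adults and has_blue_friend containers are assumptions made for illustration, and the real rule is evaluated declaratively by the RAI Native App rather than row by row.

```python
# Toy data mirroring the guide's example; friends are stored as lists of ids.
people = {
    1: {"name": "Alice", "age": 16, "favorite_color": "blue", "friends": [2]},
    2: {"name": "Bob", "age": 18, "favorite_color": "green", "friends": [1, 3]},
    3: {"name": "Carol", "age": 18, "favorite_color": None, "friends": [2]},
}

adults = set()        # stands in for membership in the Adult type
has_blue_friend = {}  # stands in for the has_blue_friend property

for pid, person in people.items():
    # Filter 1: person.age >= 18
    if not person["age"] >= 18:
        continue
    # Action 1: applies to everyone who passed filter 1.
    adults.add(pid)
    # Filter 2: some friend has favorite_color == "blue"
    if not any(people[f]["favorite_color"] == "blue" for f in person["friends"]):
        continue
    # Action 2: applies only to those who passed both filters.
    has_blue_friend[pid] = True

print(sorted(people[p]["name"] for p in adults))           # ['Bob', 'Carol']
print(sorted(people[p]["name"] for p in has_blue_friend))  # ['Bob']
```

Carol passes the first filter but not the second, so she is an adult with no has_blue_friend value, which corresponds to the NaN shown in the query results.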
Summary and Next Steps#
In this guide, you learned about the core concepts of the RelationalAI Python package:
- Models describe the entities, concepts, and logic important to your organization. Entities are represented by objects with properties that may be single- or multi-valued.
- Types represent categories of objects in a model and may be populated with objects from Snowflake tables or defined in rules.
- Rules represent facts about objects, such as which types they belong to and what properties they have.
- Queries ask questions about the objects described in a model.
The examples in this guide only scratch the surface of what you can express in a model. Check out the following resources to learn more:
- Basic Functionality: Learn how to write more sophisticated rules and queries.
- RAI for SQL Users: A comparison of RAI’s query-builder syntax with SQL.