Core Concepts#

RelationalAI (RAI) adds a layer of intelligence to your Snowflake data cloud, using Python-based models to define the relationships, rules, and reasoning that drive your organization’s decision-making process.

This guide covers the core concepts of the RAI Python package. To follow along, you’ll need:

What is a Model?#

Models represent the entities important to your organization as objects. Each object represents a real-world entity, such as a customer, account, or transaction, and it has properties that describe its attributes and relationships to other objects.

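Models describe objects by defining types, which classify objects, and rules, which determine the types and properties assigned to each object.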

Rules may leverage advanced features, like graph analytics, to infer relationships and properties of objects and drive more sophisticated decision-making. A rule in a financial fraud model, for example, might use a graph algorithm to identify suspicious accounts that need to be investigated.

You can query a model to answer questions about the objects it describes. Queries are evaluated by the RAI Native App installed in your Snowflake account, where objects are created and assigned types and properties based on the model’s rules, using data from Snowflake tables shared with the app.

The following diagram illustrates the relationship between a model, the RAI Native App, and Snowflake data for a hypothetical financial fraud model:

Diagram illustrating a model's interaction with Snowflake data. The model has four types: Criminal (data source: criminals), Account (data source: accounts), Transaction (data source: transactions), and Suspicious. It also has one rule: Accounts with transaction patterns similar to accounts owned by criminals are suspicious. The query is: "Which accounts are suspicious?" The RAI Native App processes the query using the Snowflake data sources (financedb.fraud.criminals, financedb.fraud.accounts, financedb.fraud.transactions). The results are returned to the model's Python process.

Rules and queries are written in Python using RAI’s declarative query-builder syntax. In the following sections, you’ll learn the basics of creating a model, defining types and rules, and executing queries.

Creating a Model#

Instantiate the Model class to create a model:

import relationalai as rai

model = rai.Model("MyModel")
IMPORTANT

Models connect to your Snowflake account using the active profile in your raiconfig.toml file. However, you may create a model without a config file — for example, to use RAI in a cloud notebook environment — by providing connection details to the Model constructor. See Local Installation and Cloud Notebooks for details.

You’ll use the model object to declare types, define rules, and execute queries.

Declaring Types#

Use the Model.Type() method to declare a type:

import relationalai as rai

model = rai.Model("MyModel")
Person = model.Type("Person")

model.Type("Person") returns an instance of the Type class, which is used to define objects in rules and queries. By convention, variable names for types are capitalized.

Populating Types With Objects#

When you query a model, objects are created and assigned types and properties according to the model’s rules. Data for objects may come from Snowflake tables or be defined in rules.

Defining Objects From Rows In Snowflake Tables#

To populate a type with objects from a Snowflake table or view, create a data stream from the table to your model and pass the fully-qualified table name to the Type constructor’s source parameter:

import relationalai as rai

model = rai.Model("MyModel")

# Get a Provider instance.
app = rai.Provider()

# Create a stream from a Snowflake table named "people" to your model. Note that
# <db> and <schema> are placeholders for a Snowflake database and schema.
app.create_streams(["<db>.<schema>.people"], model="MyModel")

# Declare a Person type that is populated with objects from the "people" table.
Person = model.Type("Person", source="<db>.<schema>.people")
IMPORTANT

To create a data stream, you must have SELECT privileges on the source table and the cdc_admin application role. See Data Management for more information on data streams.

Setting source="<db>.<schema>.people" ensures that, whenever the model is queried, objects that correspond to rows in the <db>.<schema>.people table are created and assigned to the Person type. Columns in the source table are used to set properties on each object, with lowercased column names used as property names.
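The column-to-property mapping described above can be pictured with a small plain-Python sketch. It does not use the RAI package: row_to_properties is a hypothetical helper, and the column names are made up for illustration.

```python
def row_to_properties(row):
    """Map a table row to object properties using lowercased column names."""
    return {column.lower(): value for column, value in row.items()}


# Hypothetical row from the "people" table; the column names are made up.
row = {"ID": 1, "NAME": "Alice", "FAVORITE_COLOR": "blue"}

properties = row_to_properties(row)
print(properties)  # {'id': 1, 'name': 'Alice', 'favorite_color': 'blue'}
```

Under this assumption, a column named NAME would be read in rules and queries as the name property.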

Only columns with the following Snowflake data types are supported in source tables:

NOTE

If you need to stream data from a source object with unsupported column types, consider creating a view without those columns or that casts the unsupported columns to a supported type.

Defining Objects in Rules#

You may specify objects directly in your model’s rules using the Type.add() method. Rules are defined using model.rule(), which returns a context manager and is used in a with statement.

The following creates a model with a Person type and defines two Person objects in a rule:

import relationalai as rai

# Create a model named "MyModel".
model = rai.Model("MyModel")

# Declare a Person type.
Person = model.Type("Person")

# Use Person.add() to define two Person objects.
with model.rule():
    Person.add(name="Alice", age=10, favorite_color="blue")
    Person.add(name="Bob", age=20)

Person.add() declares an object of type Person with properties specified as keyword arguments. Objects in the same type may have different properties. For instance, the above rule defines one Person object with a favorite_color property and another without.

IMPORTANT

.add() does not create objects when it is called. Instead, it specifies objects and properties that are created when the model is queried.

Properties passed to .add() are hashed to create a unique internal identifier for each object. Calling .add() twice with the same properties and values does not generate duplicate objects. For example, the following rule defines two distinct objects, not three:

with model.rule():
    # Define a Person object with three properties: name, age, and favorite_color.
    Person.add(name="Bob", age=20, favorite_color="green")

    # Define a second Person object with two properties: name and age. This object
    # is distinct from the first object because its hash doesn't include the
    # favorite_color property.
    Person.add(name="Bob", age=20)

    # The following does not define a third object because the properties hash
    # to the same value as above.
    Person.add(name="Bob", age=20)

The name of the type that .add() is called on is included in the object’s hash. This means that adding an object with the same properties to two different types results in two distinct objects being created:

Person = model.Type("Person")
Student = model.Type("Student")

with model.rule():
    # Define a Person object with name and age properties.
    Person.add(name="Alice", age=10)

    # Define a Student object with name and age properties. Even though the name
    # and age properties are the same as the Person object above, the objects are
    # distinct because they are created in different types.
    Student.add(name="Alice", age=10)

You may, however, assign the same object to multiple types by passing additional types as positional arguments to .add(). For instance, the following rule defines one Person object that is also a Student:

with model.rule():
    Person.add(Student, name="Alice", age=10)

As a general rule, call .add() from the most general type an object belongs to, and pass any subtypes it belongs to as positional arguments. However, you are free to model things as you see fit.
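The identity rules in this section can be modeled in plain Python. The following is a conceptual sketch only: object_id is a hypothetical helper, and the SHA-256 digest is an assumption chosen for illustration, not RAI's actual hashing scheme.

```python
import hashlib
import json


def object_id(type_name, **props):
    """Hypothetical stand-in for an object's internal identifier.

    The type name and every property passed to .add() contribute to the hash.
    """
    payload = json.dumps([type_name, sorted(props.items())])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Same type, same properties: one identity, no matter how often it is declared.
bob_a = object_id("Person", name="Bob", age=20)
bob_b = object_id("Person", age=20, name="Bob")

# An extra property changes the hash, so this is a different object.
bob_green = object_id("Person", name="Bob", age=20, favorite_color="green")

# The same properties under a different type also hash differently.
alice_person = object_id("Person", name="Alice", age=10)
alice_student = object_id("Student", name="Alice", age=10)

print(bob_a == bob_b)                 # True
print(bob_a == bob_green)             # False
print(alice_person == alice_student)  # False
```

The sketch does not cover passing extra types as positional arguments to .add(). In that case a single object is assigned to several types rather than hashed once per type.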

In the Capturing Knowledge in Rules section, you’ll learn how to assign types to objects based on their properties and relationships to other objects. But first, let’s take a closer look at properties.

Setting Object Properties#

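Objects may have two types of properties: single-valued properties, which point to one value, and multi-valued properties, which point to multiple values.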

Think of properties as arrows that connect objects to a value. They may point to values with the following types:

Multi-valued properties do not point to a single list or set of values. Instead, they point to multiple values simultaneously.

The following diagram illustrates three objects and their properties. Single-valued properties are displayed as solid arrows and multi-valued properties as dashed arrows:

A hierarchical diagram of objects and properties. A single-valued "name" property points from a Person object to the string "Bob". The Person object has a multi-valued "pets" property (dashed arrows) pointing to two objects: a Dog named "Fido" and a Cat named "Whiskers." Each pet object has a single-valued "name" property pointing to its name.

The status of a property as single- or multi-valued is fixed across the entire model.

There is only one name property in the above diagram, and it is single-valued. You may set the name property on any object, but it must always point to a single value. Similarly, there is only one pets property, which may also be set on any object but is always multi-valued.

By convention, we use plural names for multi-valued properties to distinguish them from single-valued properties.

Single-Valued Properties#

You can declare single-valued properties using the Property.declare() method:

import relationalai as rai

# Create a model named "MyModel".
model = rai.Model("MyModel")

# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")

# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.age.declare()

# Define Cat, Dog, and Person objects.
with model.rule():
    # Note that calling .add() automatically creates single-valued properties
    # even if their declarations are missing.
    Cat.add(name="Whiskers")
    Dog.add(name="Fido")
    Person.add(name="Bob")

In this example:

You aren’t required to set values for every declared property. For instance, the Person object defined in the rule above doesn’t have an age value.

NOTE

.declare() is optional. Properties are declared implicitly when you set them in a rule.

Properties set with .add() serve as the object’s primary key. When you call .add(), it returns an Instance that references the object. Use the Instance.set() method to define additional single-valued properties that aren’t part of the object’s primary key:

with model.rule():
    # Define a Cat object with primary key property id set to 1. Note that if a
    # Cat object with id=1 already exists, .add() returns a reference to the
    # existing object.
    cat = Cat.add(id=1)

    # Set the name property for the object. name is single-valued and is not
    # part of the object's primary key.
    cat.set(name="Whiskers")

    # .set() also returns an Instance object, so you can chain calls. The
    # following is equivalent to the above:
    Cat.add(id=1).set(name="Whiskers")

You can only set a single-valued property once per object:

# The following pair of rules is invalid.

with model.rule():
    Cat.add(id=1).set(name="Whiskers")

with model.rule():
    # Cat(id=1) returns an Instance that references the Cat object with id=1.
    # This is the same object defined in the previous rule. Setting two values
    # for the same single-valued property is invalid.
    Cat(id=1).set(name="Fluffy")

It’s tempting to interpret the rules in the preceding example as first creating a Cat object with a name property set to "Whiskers" and then updating the name property to "Fluffy". But that’s not what happens.

Rules are not executed sequentially. Each rule describes a fact about the objects that are created when you query the model. The two rules above describe contradictory facts for the same object, which is invalid.

You may, however, set different properties for the same object in different rules:

# Define a Cat object with id=1 and set its name property to "Whiskers".
with model.rule():
    Cat.add(id=1).set(name="Whiskers")

# Set the breed and color properties for the Cat object with id=1.
with model.rule():
    Cat(id=1).set(breed="Siamese", color="white")
IMPORTANT

Contradictory rules may cause undefined behavior and should be avoided. Exceptions may be raised when invalid rules are detected, but this is not guaranteed. Use the RAI debugger to help identify and resolve issues with your model.
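One way to picture why rule order is irrelevant is to treat each rule as contributing facts to a single set. The following plain-Python analogy is a sketch under that assumption. It is not how the RAI Native App evaluates rules, and every name in it is hypothetical.

```python
def collect_facts(rules):
    """Gather (object, property, value) facts from every rule into one set."""
    facts = set()
    for rule in rules:
        facts.update(rule())
    return facts


def find_conflicts(facts):
    """Return single-valued properties that were given more than one value."""
    values = {}
    for obj, prop, value in facts:
        values.setdefault((obj, prop), set()).add(value)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


def rule_whiskers():
    return {("Cat(id=1)", "name", "Whiskers")}


def rule_fluffy():
    return {("Cat(id=1)", "name", "Fluffy")}


def rule_breed():
    return {("Cat(id=1)", "breed", "Siamese")}


# Rule order does not matter: both orderings yield the same facts and conflict.
forward = find_conflicts(collect_facts([rule_whiskers, rule_fluffy, rule_breed]))
backward = find_conflicts(collect_facts([rule_breed, rule_fluffy, rule_whiskers]))

print(forward == backward)  # True
print(sorted(forward[("Cat(id=1)", "name")]))  # ['Fluffy', 'Whiskers']
```

In this picture there is no "first" or "latest" value for name, only two contradictory facts. The breed fact raises no conflict because it concerns a different property.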

Multi-Valued Properties#

Declare multi-valued properties using the Property.has_many() method:

import relationalai as rai

# Create a model named "MyModel".
model = rai.Model("MyModel")

# Declare Cat, Dog, and Person types.
Cat = model.Type("Cat")
Dog = model.Type("Dog")
Person = model.Type("Person")

# Declare the properties used by each type. This is optional.
Cat.name.declare()
Dog.name.declare()
Person.name.declare()
Person.pets.has_many()  # pets is multi-valued.

# Define Cat, Dog, and Person objects.
with model.rule():
    whiskers = Cat.add(name="Whiskers")
    fido = Dog.add(name="Fido")
    bob = Person.add(name="Bob")

    # Extend Bob's multi-valued pets property with Whiskers and Fido. Note that
    # calling bob.pets.extend() automatically creates the multi-valued pets property
    # even if its declaration is missing.
    bob.pets.extend([fido, whiskers])

    # Alternatively, you may add values to a multi-valued property one at a time
    # using bob.pets.add(). The following two lines are equivalent to the one above.
    # Like .extend(), .add() automatically creates the multi-valued pets property
    # if its declaration is missing.
    bob.pets.add(fido)
    bob.pets.add(whiskers)

In this example:

Just like single-valued properties, you aren’t required to set values for every multi-valued property that you declare.

NOTE

.has_many() is optional in most cases. Multi-valued properties are declared implicitly when you set them in a rule.

The only exception is when you define objects from Snowflake tables. By default, properties created from columns in the source table are single-valued. If you intend to use a column as a multi-valued property, you must declare it as such.

Unlike single-valued properties, you may set multi-valued properties multiple times. Each rule that sets a multi-valued property adds to the property’s values:

# Define a Person object with two pets.
with model.rule():
    bob = Person.add(name="Bob")
    bob.pets.extend([
        Cat.add(name="Whiskers"),
        Dog.add(name="Fido"),
    ])

# Define another Dog object and set it as Bob's pet.
with model.rule():
    # Person(name="Bob") returns an Instance that references the Person object
    # with name="Bob".
    bob = Person(name="Bob")

    # Add another Dog object to Bob's pets. Here, bob.pets.add() is used instead
    # of bob.pets.extend() since only one pet is being added.
    bob.pets.add(Dog.add(name="Buddy"))

The second rule doesn’t replace the pets that are set in the first rule. Rather, Bob’s pets property points to three objects: Whiskers, Fido, and Buddy.

Multi-valued properties have set-like semantics. Setting the same value for a multi-valued property multiple times doesn’t create duplicate property values.
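The set-like behavior can be mimicked with Python's built-in set type. This is an analogy rather than RAI code: multi_valued, add_value, and extend_values are hypothetical names that loosely mirror .add() and .extend() on a multi-valued property.

```python
from collections import defaultdict

# Hypothetical store: (object, property) -> set of values.
multi_valued = defaultdict(set)


def add_value(obj, prop, value):
    """Add one value to a multi-valued property."""
    multi_valued[(obj, prop)].add(value)


def extend_values(obj, prop, values):
    """Add several values to a multi-valued property."""
    multi_valued[(obj, prop)].update(values)


# First rule: Bob has two pets.
extend_values("Bob", "pets", ["Whiskers", "Fido"])

# Second rule: adds a pet rather than replacing the existing ones.
add_value("Bob", "pets", "Buddy")

# Setting a value that is already present changes nothing.
add_value("Bob", "pets", "Fido")

print(sorted(multi_valued[("Bob", "pets")]))  # ['Buddy', 'Fido', 'Whiskers']
```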

Executing Queries#

Queries are written using the model.query() method, which, like model.rule(), returns a context manager that is used in a with statement.

This section introduces the basics of querying a model. For a more in-depth look at RAI’s query-builder syntax, see the Basic Functionality guide.

Selecting Object Properties#

The following example creates a model with a Person type, defines some Person objects, and then queries the model for their IDs, names, and favorite colors:

import relationalai as rai


# =====
# SETUP
# =====

# Create a model named "MyModel".
model = rai.Model("MyModel")

# Declare a Person type.
Person = model.Type("Person")

with model.rule():
    # Define Person objects.
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)  # Carol has no favorite_color.

    # Connect people to their friends with a multi-valued friends property.
    # Visually, the friends property looks like: Alice <-> Bob <-> Carol.
    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)


# =======
# EXAMPLE
# =======

# Query the model for the IDs, names, and favorite colors of people.
with model.query() as select:
    person = Person()
    response = select(person.id, person.name, person.favorite_color)

# Print the query results.
print(response.results)
#    id   name favorite_color
# 0   1  Alice           blue
# 1   2    Bob          green
# 2   3  Carol            NaN

Let’s break down the model.query() block:

If an object lacks a value for a selected property, null values are returned. For example, Carol has no favorite color, so it’s displayed as NaN in the results. Refer to Dealing With Null Values for more information on handling null values in queries.

NOTE

Queries may be configured to return Snowpark DataFrames instead of pandas DataFrames. See Changing the Query Result Format for details.

Queries work a bit like a SQL SELECT statement, but with a different order:

RAI Python | SQL Interpretation
person = Person() | FROM Person person
select(person.name, person.favorite_color) | SELECT person.name, person.favorite_color
INFO

Are you a SQL developer? Check out our RAI Python to SQL comparison guide.

When you select a single-valued property, like person.name, one row is returned for each person object filtered by the query. This can lead to duplicate rows in the results if multiple objects have the same property value. For instance, querying the model for people’s ages returns two rows where the age is 18:

with model.query() as select:
    person = Person()
    response = select(person.age)

print(response.results)
#    age
# 0   16
# 1   18
# 2   18

One of the duplicate rows is for Bob, and the other is for Carol. Use select.distinct() to remove duplicate rows:

with model.query() as select:
    person = Person()
    response = select.distinct(person.age)

print(response.results)
#    age
# 0   16
# 1   18

Selecting a multi-valued property, like person.friends, returns a row for each person and each of their friends, which can again lead to duplicate rows in the results:

with model.query() as select:
    person = Person()
    response = select(person.friends)

print(response.results)
#                   friends
# 0  XXolLCtOngI6pFJ128Xktg
# 1  d1SmRsWF5TLVmYhCCPhD9g
# 2  g4rDjPY1HHWkEikWQXw+3Q
# 3  g4rDjPY1HHWkEikWQXw+3Q

person.friends points to objects, so the friends column in the results displays the internal identifier of each friend object. There are two rows with the same identifier. This makes sense because Bob is friends with both Alice and Carol, so we should expect two rows corresponding to the Bob object.

Property access may be chained. For example, person.friends.name accesses the name property of each person.friends object. Selecting person.friends.name does not return a row per person per friend. Instead, it returns a row per unique friend object in the set of all person.friends objects:

with model.query() as select:
    person = Person()
    response = select(person.friends.name)

print(response.results)
#     name
# 0  Alice
# 1    Bob  <-- Only one row for Bob, even though he's friends with two people.
# 2  Carol

In general, whatever is to the left of the last dot (.) in a property chain determines the property’s key. Since person.friends is keyed by person, selecting it returns rows for each person object. person.friends.name is keyed by friends, so selecting it returns rows for each unique object assigned to some person’s friends property.

When you select multiple properties, multiple keys may be used to determine the rows in the results:

with model.query() as select:
    person = Person()
    friend = person.friends
    response = select(person.name, friend.name)

print(response.results)
#     name  name2
# 0  Alice    Bob
# 1    Bob  Alice
# 2    Bob  Carol
# 3  Carol    Bob

Each row in the results is keyed by both person and friends, so there is one row for each person-friend pair.
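To see how the key determines the number of rows, the three selections above can be imitated with ordinary dictionaries. This is a plain-Python sketch that assumes the friends data from the setup. The names and friends dictionaries are hypothetical stand-ins, and the integer keys merely play the role of object identifiers.

```python
# Hypothetical in-memory copy of the friends data from the setup above.
names = {1: "Alice", 2: "Bob", 3: "Carol"}
friends = {1: [2], 2: [1, 3], 3: [2]}

# select(person.friends): keyed by person, so one row per person-friend pair.
friend_rows = [friend for person in sorted(friends) for friend in friends[person]]

# select(person.friends.name): keyed by the friend object, so one row per
# unique friend.
unique_friends = sorted({friend for row in friends.values() for friend in row})
friend_name_rows = [names[friend] for friend in unique_friends]

# select(person.name, friend.name): keyed by both, so one row per pair.
pair_rows = [
    (names[person], names[friend])
    for person in sorted(friends)
    for friend in friends[person]
]

print(len(friend_rows))   # 4
print(friend_name_rows)   # ['Alice', 'Bob', 'Carol']
print(pair_rows)
# [('Alice', 'Bob'), ('Bob', 'Alice'), ('Bob', 'Carol'), ('Carol', 'Bob')]
```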

Filtering Objects#

Declare conditions in a query to filter objects:

with model.query() as select:
    person = Person()
    person.age >= 18
    response = select(person.name, person.age)

print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18
IMPORTANT

In typical Python, person.age >= 18 returns a Boolean value. But with RAI, it returns an Expression object, the truth value of which isn’t determined until the query is compiled and processed by the RAI Native App.

As a result, keywords like if, and operators like and and or, are forbidden in queries. See A Note About Logical Operators for more information.

The preceding query selects only people who are at least 18 years old. person.age >= 18 is similar to a SQL WHERE clause:

RAI Python | SQL Interpretation
person = Person() | FROM Person person
person.age >= 18 | WHERE person.age >= 18
select(person.name, person.age) | SELECT person.name, person.age
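The reason a comparison can yield something other than a Boolean is operator overloading. The sketch below shows the general technique with two hypothetical classes. They are not RAI's actual Expression or property classes, and raising an error in __bool__ is a choice made for this sketch, not a statement about what RAI does.

```python
class Expression:
    """Hypothetical stand-in for a deferred comparison."""

    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

    def __bool__(self):
        # Python calls __bool__ for `if`, `and`, `or`, and `not`.
        raise TypeError("truth value is unknown until the query is evaluated")

    def __repr__(self):
        return f"Expression({self.left} {self.op} {self.right})"


class PropertyRef:
    """Hypothetical stand-in for something like person.age."""

    def __init__(self, name):
        self.name = name

    def __ge__(self, other):
        # Build a description of the comparison instead of evaluating it.
        return Expression(self.name, ">=", other)


age = PropertyRef("person.age")
condition = age >= 18
print(condition)  # Expression(person.age >= 18)

try:
    if condition:
        pass
except TypeError as error:
    print(f"TypeError: {error}")
```

Because if, and, and or all need a concrete truth value immediately, they cannot work with a condition whose value is only known once the query has been evaluated.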

Queries can perform joins and declare multiple conditions:

with model.query() as select:
    person1, person2 = Person(), Person()
    person1.age >= 16
    person2.favorite_color.in_(["red", "blue"])
    response = select(person1.name, person2.name)

print(response.results)
#     name  name2
# 0  Alice  Alice
# 1    Bob  Alice
# 2  Carol  Alice

Each line in the query body is combined with AND, so this query selects pairs of people where the first person is at least 16 and the second’s favorite color is red or blue. See Filtering Objects by Type and Filtering Objects by Property Value in the Basic Functionality guide for more information on filtering objects.
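The join above behaves much like filtering every pair of people. The following plain-Python sketch reproduces the same result with itertools.product. The people list is a hypothetical in-memory copy of the setup data, not something the RAI package provides.

```python
from itertools import product

# Hypothetical in-memory copy of the Person data from the setup above.
people = [
    {"name": "Alice", "age": 16, "favorite_color": "blue"},
    {"name": "Bob", "age": 18, "favorite_color": "green"},
    {"name": "Carol", "age": 18},  # Carol has no favorite_color.
]

rows = [
    (person1["name"], person2["name"])
    for person1, person2 in product(people, people)
    # Every condition must hold, as if joined by AND.
    if person1["age"] >= 16
    and person2.get("favorite_color") in ("red", "blue")
]

print(rows)  # [('Alice', 'Alice'), ('Bob', 'Alice'), ('Carol', 'Alice')]
```

Only Alice satisfies the color condition, so she is the second person in every row, while all three people satisfy the age condition for the first position.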

Changing the Query Result Format#

Queries return a pandas DataFrame by default. This downloads all of the results and, for large result sets, may be slow and consume a lot of memory. To avoid this, you can set the query’s format parameter to "snowpark" to return a Snowpark DataFrame instead:

with model.query(format="snowpark") as select:
    person = Person()
    response = select(person.name, person.favorite_color)

response.results.show()
# ---------------------------
# |"NAME"  |"FAVORITE_COLOR"|
# ---------------------------
# |Alice   |blue            |
# |Bob     |green           |
# |Carol   |null            |
# ---------------------------

Alternatively, set format="snowpark" when you instantiate the model to change the default result format:

model = rai.Model("MyModel", format="snowpark")

Data for Snowpark DataFrames are stored in Snowflake. Only a small portion of the data is downloaded for display. You can use DataFrame methods to manipulate the data and even save the results to a table in a Snowflake database. See Writing Results to Snowflake for more information.

Capturing Knowledge in Rules#

Rules let you express knowledge about objects and their relationships. For example, a rule can capture the fact “All people who are 18 years or older are adults” by setting Person objects to an Adult type if they meet the age requirement:

import relationalai as rai


# =====
# SETUP
# =====

# Create a model named "MyModel".
model = rai.Model("MyModel")

# Declare Person and Adult types.
Person = model.Type("Person")
Adult = model.Type("Adult")

# Define Person objects.
with model.rule():
    alice = Person.add(id=1).set(name="Alice", age=16, favorite_color="blue")
    bob = Person.add(id=2).set(name="Bob", age=18, favorite_color="green")
    carol = Person.add(id=3).set(name="Carol", age=18)

    alice.friends.add(bob)
    bob.friends.extend([alice, carol])
    carol.friends.add(bob)


# =======
# EXAMPLE
# =======

# Define a rule that sets Person objects to the Adult type if they are 18 or older.
with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)

# Query the model for the names and ages of adults.
with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age)

print(response.results)
#     name  age
# 0    Bob   18
# 1  Carol   18

Rules work like queries, except that instead of selecting things, they add new objects or set types and properties of existing objects.

Using the SQL analogy, you can interpret the above rule as something like the following:

RAI Python | SQL Interpretation
person = Person() | FROM Person person
person.age >= 18 | WHERE person.age >= 18
person.set(Adult) | INSERT INTO Adult VALUES (person)

Methods like Type.add() and Instance.set() act on the objects filtered by the rule. But rules don’t have to stop after just one action. They may continue to filter objects and take additional actions:

with model.rule():
    person = Person()
    person.age >= 18
    person.set(Adult)
    person.friends.favorite_color == "blue"
    person.set(has_blue_friend=True)

with model.query() as select:
    adult = Adult()
    response = select(adult.name, adult.age, adult.has_blue_friend)

print(response.results)
#     name  age has_blue_friend
# 0    Bob   18            True
# 1  Carol   18             NaN

In this version of the rule:

Filters and actions are applied in the declared order, with actions affecting only objects that pass all preceding filters. For example, both Bob and Carol are adults, but only Bob’s has_blue_friend property is set to True.
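The interplay of filters and actions can be traced in ordinary Python. The loop below is a rough imperative analogy for the declarative rule above, using a hypothetical in-memory copy of the setup data. It illustrates which objects each action reaches, not how the rule is actually evaluated.

```python
# Hypothetical in-memory copy of the Person data from the setup above.
people = {
    "Alice": {"age": 16, "favorite_color": "blue", "friends": ["Bob"]},
    "Bob": {"age": 18, "favorite_color": "green", "friends": ["Alice", "Carol"]},
    "Carol": {"age": 18, "friends": ["Bob"]},
}

adults = set()
has_blue_friend = {}

for name, person in people.items():
    # Filter 1: only people who are at least 18 continue.
    if person["age"] < 18:
        continue
    # Action 1: applies to everyone who passed filter 1.
    adults.add(name)

    # Filter 2: only people with a friend whose favorite color is blue continue.
    if not any(
        people[friend].get("favorite_color") == "blue"
        for friend in person["friends"]
    ):
        continue
    # Action 2: applies only to people who passed both filters.
    has_blue_friend[name] = True

print(sorted(adults))   # ['Bob', 'Carol']
print(has_blue_friend)  # {'Bob': True}
```

Carol passes the first filter and becomes an adult, but her only friend is Bob, whose favorite color is green, so the second action never reaches her.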

Summary and Next Steps#

In this guide, you learned about the core concepts of the RelationalAI Python package:

The examples in this guide only scratch the surface of what you can express in a model. To learn more: