Getting Started: Cloud Notebooks#

IMPORTANT

Before you can use relationalai, ensure that your Snowflake account administrator has:

  1. Installed the RelationalAI Native App from the Snowflake Marketplace.
  2. Granted your Snowflake user the app_user application role.

See the Native App Installation guide for details.

You can use RelationalAI in cloud notebook environments like Snowflake, Google Colab or Hex. Click on the tabs below to see the instructions for each cloud notebook environment.

Install RelationalAI#

Follow these steps to install the RelationalAI (RAI) Python package for use in a Snowflake notebook environment:

  1. Choose Notebook settings from the triple-dots (⋮) dropdown in the top-right corner of the Snowflake Notebooks window and switch to the External access tab. Turn on the S3_RAI_INTERNAL_BUCKET_EGRESS_INTEGRATION toggle. This allows your notebook to access data from the RAI Native App.
  2. Upload the RAI Python library as a ZIP file into your notebook filesystem. You can download the ZIP file here.
  3. Run the following code to make the packages in the ZIP file visible to the Python interpreter:
#import sys
sys.path.append("./relationalai.zip")

Test Your Installation#

Verify that the relationalai package is installed and working correctly by running the following code:

#import relationalai as rai

# Create a model named "MyFirstModel".
model = rai.Model("MyFirstModel")

# Send a test query.
with model.query() as select:
    response = select(1 + 1)

# The results are stored as a pandas DataFrame in the response.results attribute.
print(response.results)
# Output:
#    v
# 0  2

Build a Model#

RAI models are built on top of data in your Snowflake account and provide a suite of tools for advanced reasoning and analytics. Some examples of the things you can do with a model include:

To get a taste of what you can do with RAI, let’s create a model of customers, products, and purchases, and use a graph algorithm to segment customers into groups with similar purchase behavior.

Create a Sample Dataset and Share It With Your Model#

Execute the following to create a sample dataset of purchases in your Snowflake account and share it with your model:

#provider = rai.Provider()

# Create a table of purchases.
provider.sql("""
begin
    create database if not exists RAI_DEMO;
    create schema if not exists RAI_DEMO.PUBLIC;

    create or replace table RAI_DEMO.PUBLIC.PURCHASES (
        id int,
        customer_id int,
        product_id int,
        date date
    );

    insert into RAI_DEMO.PUBLIC.PURCHASES(id, customer_id, product_id, date) values
    (1, 1001, 2001, '2023-08-01'),
    (2, 1001, 2002, '2023-08-03'),
    (3, 1001, 2003, '2023-08-05'),
    (4, 1002, 2001, '2023-08-02'),
    (5, 1002, 2003, '2023-08-06'),
    (6, 1003, 2002, '2023-08-04'),
    (7, 1003, 2004, '2023-08-07'),
    (8, 1003, 2005, '2023-08-09'),
    (9, 1004, 2003, '2023-08-10'),
    (10, 1004, 2005, '2023-08-12'),
    (11, 1005, 2001, '2023-08-15'),
    (12, 1005, 2004, '2023-08-17'),
    (13, 1006, 2002, '2023-08-20'),
    (14, 1006, 2005, '2023-08-22'),
    (15, 1007, 2003, '2023-08-25'),
    (16, 1007, 2004, '2023-08-27'),
    (17, 1008, 2002, '2023-08-30'),
    (18, 1008, 2005, '2023-09-01'),
    (19, 1009, 2001, '2023-09-02'),
    (20, 1009, 2003, '2023-09-05');
end;
""")

# Share the RAI_DEMO.PUBLIC.PURCHASES table with the model.
provider.create_streams(["RAI_DEMO.PUBLIC.PURCHASES"], "MyFirstModel")

Define Types and Rules#

In RAI, entities like purchases, customers, and products are represented as objects with properties that describe their attributes and relationships. Objects may be defined from rows in a Snowflake table, or they can be defined in rules.

The following code defines Purchase, Customer, and Product types. Purchase objects correspond to rows in the RAI_DEMO.PUBLIC.PURCHASES table, while Customer and Product objects are defined with a rule:

## Define the Purchase type with objects sourced from the rows of the
# RAI_DEMO.PUBLIC.PURCHASES table. Columns in the table are automatically
# mapped to properties of Purchase objects.
Purchase = model.Type("Purchase", source="RAI_DEMO.PUBLIC.PURCHASES")

# Define Customer and Product types. These types have no source, so their objects
# must be specified in rules.
Customer = model.Type("Customer")
Product = model.Type("Product")

# Define a rule that describes Customer and Product objects with id properties based
# on Purchase objects' customer_id and product_id properties. This rule also defines
# a products_purchased property to connect customers to products they've purchased.
with model.rule():
    purchase = Purchase()
    customer = Customer.add(id=purchase.customer_id)
    product = Product.add(id=purchase.product_id)
    customer.products_purchased.extend([product])

Query Your Model#

With types and rules defined, you can now query your model. The following query gets the IDs of products purchased by the customer with ID 1001:

#with model.query() as select:
    customer = Customer(id=1001)
    response = select(customer.products_purchased.id)

print(response.results)
#      id
# 0  2001
# 1  2002
# 2  2003

When you run this query, the model compiles the query along with all of its types and rules and sends the compiled query to the RAI Native App running in your Snowflake account. The native app generates objects according to the model’s rules, evaluates the query, and returns the results to your Python environment.

Add Advanced Analytics#

Let’s define a graph-based rule to segment customers based on their purchase behavior.

First, create a graph using Customer objects as nodes and edges between customers who have purchased the same product. Then, use the Louvain algorithm to segment customers into groups with similar purchases:

#from relationalai.std.graphs import Graph

# Create an undirected graph.
customer_graph = Graph(model, undirected=True)

# Set Customer objects as nodes in the graph.
customer_graph.Node.extend(Customer)

# Define edges between customers who have purchased the same product.
with model.rule():
    customer1, customer2 = Customer(), Customer()
    customer1.products_purchased == customer2.products_purchased
    customer_graph.Edge.add(customer1, customer2)

# Segment customers using the Louvain algorithm.
with model.rule():
    customer = Customer()
    customer.set(segment=customer_graph.compute.louvain(customer))

# Query the model for customer IDs and their segments.
with model.query() as select:
    customer = Customer()
    response = select(customer.id, customer.segment)

print(response.results)
#      id  segment
# 0  1001        2
# 1  1002        1
# 2  1003        2
# 3  1004        2
# 4  1005        1
# 5  1006        2
# 6  1007        1
# 7  1008        2
# 8  1009        1

# What products were purchased by customers in each segment?
with model.query() as select:
    customer = Customer()
    response = select.distinct(customer.segment, customer.products_purchased.id)

print(response.results)
#    segment    id
# 0        1  2001
# 1        1  2003
# 2        1  2004
# 3        2  2001
# 4        2  2002
# 5        2  2003
# 6        2  2004
# 7        2  2005

Our model found two customer segments with distinct purchase behavior. Customers in segment 1 seem to prefer a specific subset of products. Customers in segment 2 purchase a wider variety of products, with some overlap with segment 1.

Armed with this knowledge, you could craft targeted marketing campaigns, or recommend products to customers who haven’t yet purchased a popular product in their segment.

Next Steps#

To learn more about using RAI: