data_stream_batches
relationalai.api.data_stream_batches
A view containing batch processing information for data streams, such as each stream’s current synchronization status, the number of remaining batches, and any errors encountered during processing.
Requires the cdc_admin application role.
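As a minimal sketch, the role can be granted with Snowflake's standard application-role grant. This assumes the app is installed under the name relationalai, as in the example query on this page. The account role my_role is hypothetical.

```sql
-- my_role is a hypothetical account role; replace it with the role
-- that should be allowed to query the view.
GRANT APPLICATION ROLE relationalai.cdc_admin TO ROLE my_role;
```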
Columns
Column Name | Data Type | Description |
---|---|---|
ID | STRING | The unique batch identifier. |
DATA_STREAM_ID | STRING | The data stream’s unique identifier. |
RAI_DATABASE | STRING | The name of the RAI Python model for which the stream was created. |
RAI_RELATION | STRING | The name of the stream as passed to the stream_name parameter of the create_data_stream() procedure. |
FQ_OBJECT_NAME | STRING | The fully-qualified name of the stream’s source table or view, e.g. '<db>.<schema>.<table_or_view>' . |
COLUMNS | ARRAY | An array of JSON objects containing column information for the batch. |
BATCH_DETAILS | OBJECT | A JSON object containing details about the batch, such as batch start and end times and the number of rows processed. |
UNLOADED | TIMESTAMP | The timestamp when the batch was unloaded. |
STATUS | STRING | The current status of the batch. Values shown in the example on this page include PENDING, PROCESSING, LOADED, and QUARANTINED. |
TRANSACTION_ID | STRING | The transaction ID of the batch process on the CDC Engine. Use the api.transactions view to get transaction details. |
PROCESSING_DETAILS | OBJECT | A JSON object containing details about batch processing, including any errors encountered. |
LAST_UPDATE | TIMESTAMP | The timestamp of the last batch update from the CDC Service. |
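To get a quick picture of synchronization progress, one option is to count batches per stream and status. The following sketch uses only the columns listed above; the aliases batch_count and latest_update are illustrative.

```sql
-- Summarize how many batches each data stream has in each status,
-- along with the most recent update time for that group.
SELECT
    data_stream_id,
    fq_object_name,
    status,
    COUNT(*)         AS batch_count,
    MAX(last_update) AS latest_update
FROM relationalai.api.data_stream_batches
GROUP BY data_stream_id, fq_object_name, status
ORDER BY data_stream_id, status;
```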
Each JSON object in the COLUMNS array has the following fields:
Field Name | Data Type | Description |
---|---|---|
column | NUMBER | The column number. |
name | STRING | The column name. |
type | STRING | The column data type. |
numericPrecision | NUMBER | The column’s numeric precision. NULL for non-numeric columns. |
numericScale | NUMBER | The column’s numeric scale. NULL for non-numeric columns. |
default | | The column’s default value. NULL if no default is set. |
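Because COLUMNS is an array, it can be expanded to one row per column with Snowflake's LATERAL FLATTEN. This is a sketch that assumes the field names listed above appear as case-sensitive keys in each JSON object. The output aliases are illustrative.

```sql
-- Expand the COLUMNS array so that each source column of a batch
-- becomes its own row. Bracket notation avoids clashes between
-- keys such as "column" and SQL keywords.
SELECT
    b.id                                  AS batch_id,
    c.value['column']::NUMBER             AS column_number,
    c.value['name']::STRING               AS column_name,
    c.value['type']::STRING               AS column_type,
    c.value['numericPrecision']::NUMBER   AS numeric_precision,
    c.value['numericScale']::NUMBER       AS numeric_scale
FROM relationalai.api.data_stream_batches AS b,
    LATERAL FLATTEN(input => b.columns) AS c
ORDER BY batch_id, column_number;
```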
The BATCH_DETAILS JSON object has the following fields:
Field Name | Data Type | Description |
---|---|---|
writeChangesStart | TIMESTAMP | The timestamp when the batch write changes started. |
writeChangesEnd | TIMESTAMP | The timestamp when the batch write changes ended. |
writeChangesDuration | NUMBER | The duration of the batch write changes in milliseconds. |
rows | NUMBER | The number of rows processed in the batch. |
size | NUMBER | The size of the batch in bytes. |
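Individual BATCH_DETAILS fields can be pulled into their own columns with Snowflake's semi-structured access syntax. The sketch below assumes the keys match the field names above exactly. The aliases row_count, size_bytes, and write_ms are illustrative.

```sql
-- Extract batch size metrics and list the slowest writes first.
SELECT
    id,
    batch_details['rows']::NUMBER                 AS row_count,
    batch_details['size']::NUMBER                 AS size_bytes,
    batch_details['writeChangesDuration']::NUMBER AS write_ms
FROM relationalai.api.data_stream_batches
ORDER BY write_ms DESC NULLS LAST;
```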
The PROCESSING_DETAILS JSON object has the following fields:
Field Name | Data Type | Description |
---|---|---|
transactionStart | TIMESTAMP | The timestamp when the CDC engine began processing the transaction. |
transactionEnd | TIMESTAMP | The timestamp when the CDC engine finished processing the transaction. |
transactionDuration | NUMBER | The duration of the transaction processing in milliseconds. |
processingErrors | ARRAY | An array of error messages, if any, encountered during batch processing. |
failedTransactions | ARRAY | An array of transaction IDs that failed during batch processing. |
retries | NUMBER | The number of retries attempted during batch processing. |
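One way to surface processing errors is to flatten the processingErrors array, which returns one row per error message. Batches with an empty array drop out because FLATTEN skips empty inputs by default. This is a sketch that assumes the keys match the field names above.

```sql
-- List every recorded error message with its batch, most recently
-- updated batches first. Batches without errors produce no rows.
SELECT
    b.id                                        AS batch_id,
    b.status,
    b.transaction_id,
    b.processing_details['retries']::NUMBER     AS retries,
    e.value::STRING                             AS error_message
FROM relationalai.api.data_stream_batches AS b,
    LATERAL FLATTEN(input => b.processing_details['processingErrors']) AS e
ORDER BY b.last_update DESC;
```

The TRANSACTION_ID values returned here can then be looked up in the api.transactions view.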
Example
Use the api.data_stream_batches view to retrieve data stream batch processing information:
SELECT * FROM relationalai.api.data_stream_batches;
/*+------------------------------------+-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------+
| ID | DATA_STREAM_ID | RAI_DATABASE | RAI_RELATION | FQ_OBJECT_NAME | COLUMNS | BATCH_DETAILS | UNLOADED | STATUS | TRANSACTION_ID | PROCESSING_DETAILS | LAST_UPDATE |
|------------------------------------+-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------|
| dsb_1234abcd_5678_ef90_1234_abcd56 | ds_f9a8d7c2_ae1f_431b_912c_c0e9a123 | MyModel | example_db.public.table1 | example_db.public.table1 | [{"name": "col1", ...}, ...] | {"rows": 150, ...} | 2024-10-24 09:16:30.123 | LOADED | 03bc5678-1234-5678-90ab | {"processingErrors": [], ...} | 2024-10-24 09:16:30.123 |
| dsb_7890efgh_1234_abcd_5678_efgh12 | ds_a1b2c3d4_e5f6_7a89_b123_d456e789 | SalesModel | example_db.sales.view1 | example_db.sales.view1 | [{"name": "colA", ...}, ...] | {"rows": 200, ...} | 2024-10-23 12:25:45.250 | PROCESSING | 02de4567-8901-2345-6789 | {"processingErrors": [], ...} | 2024-10-23 12:25:45.250 |
| dsb_abcd5678_ef90_1234_5678_ghij90 | ds_8e7f6d5c_4a3b_2c1d_0e9f_7b6a8d9f | HRModel | example_db.hr.employees | example_db.hr.employees | [{"name": "emp_id", ...}, ...] | {"rows": 50, ...} | 2024-10-22 15:39:10.580 | QUARANTINED | 01ef3456-7890-abcd-ef01 | {"processingErrors": ["Error processing data stream", ...], ...} | 2024-10-22 15:39:10.580 |
| dsb_ef901234_abcd_5678_efgh_1234kl | ds_9a8b7c6d_5e4f_3d2a_1b0e_f7g6h5i3 | FinanceModel | example_db.finance.budget | example_db.finance.budget | [{"name": "amount", ...}, ...] | {"rows": 120, ...} | 2024-10-21 17:46:10.300 | PENDING | 03ab4567-ef90-1234-5678 | {"processingErrors": [], ...} | 2024-10-21 17:46:10.300 |
+------------------------------------+-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------+ */
Use the api.data_streams view to get information about the data stream associated with each batch.
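A join between the two views might look like the sketch below. It assumes that api.data_streams exposes an ID column matching DATA_STREAM_ID, which this page does not confirm. Check the column list of that view before relying on it.

```sql
-- Assumption: relationalai.api.data_streams has an ID column that
-- corresponds to DATA_STREAM_ID in the batches view.
SELECT
    b.id     AS batch_id,
    b.status AS batch_status,
    s.*
FROM relationalai.api.data_stream_batches AS b
JOIN relationalai.api.data_streams AS s
    ON s.id = b.data_stream_id;
```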
See Data Management for more information about data streams.