data_stream_batches#

relationalai.api.data_stream_batches

A view containing data stream batch processing information, such as each stream's current synchronization status, the number of remaining batches, and any errors encountered during processing. Requires the cdc_admin application role.
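
For example, an account administrator can grant this application role to an existing account role before querying the view. A minimal sketch, assuming the application is installed under the name relationalai and that a role named rai_user (a hypothetical example) already exists:

-- Grant the cdc_admin application role so that rai_user can query the CDC views.
-- NOTE: rai_user is a hypothetical role name used for illustration.
GRANT APPLICATION ROLE relationalai.cdc_admin TO ROLE rai_user;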

Columns#

| Column Name | Data Type | Description |
| --- | --- | --- |
| ID | STRING | The unique batch identifier. |
| DATA_STREAM_ID | STRING | The data stream’s unique identifier. |
| RAI_DATABASE | STRING | The name of the RAI Python model for which the stream was created. |
| RAI_RELATION | STRING | The name of the stream as passed to the stream_name parameter of the create_data_stream() procedure. |
| FQ_OBJECT_NAME | STRING | The fully-qualified name of the stream’s source table or view, e.g. '<db>.<schema>.<table_or_view>'. |
| COLUMNS | ARRAY | An array of JSON objects containing column information for the batch. |
| BATCH_DETAILS | OBJECT | A JSON object containing details about the batch, such as batch start and end times and the number of rows processed. |
| UNLOADED | TIMESTAMP | The timestamp when the batch was unloaded. |
| STATUS | STRING | The current status of the batch. One of PENDING, PROCESSING, QUARANTINED, or LOADED. See Quarantined Streams for details on what it means for a batch to be quarantined. |
| TRANSACTION_ID | STRING | The transaction ID of the batch process on the CDC Engine. Use the api.transactions view to get transaction details. |
| PROCESSING_DETAILS | OBJECT | A JSON object containing details about batch processing, including any errors encountered. |
| LAST_UPDATE | TIMESTAMP | The timestamp of the last batch update from the CDC Service. |
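
As an illustration of how these columns might be used, the following query, a sketch assuming standard Snowflake SQL, lists batches that have not yet finished loading for one model (the model name 'MyModel' is taken from the sample output below):

-- List unfinished batches for one model, most recently updated first.
SELECT ID, STATUS, TRANSACTION_ID, LAST_UPDATE
FROM relationalai.api.data_stream_batches
WHERE RAI_DATABASE = 'MyModel'
  AND STATUS IN ('PENDING', 'PROCESSING')
ORDER BY LAST_UPDATE DESC;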

The fields of the JSON objects in the COLUMNS array are as follows:

| Field Name | Data Type | Description |
| --- | --- | --- |
| column | NUMBER | The column number. |
| name | STRING | The column name. |
| type | STRING | The column data type. |
| numericPrecision | NUMBER | The column’s numeric precision. NULL for non-numeric columns. |
| numericScale | NUMBER | The column’s numeric scale. NULL for non-numeric columns. |
| default | | The column’s default value. NULL if no default is set. |
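
To inspect this per-column metadata, the COLUMNS array can be expanded with Snowflake's LATERAL FLATTEN. This is only a sketch; the field names follow the table above:

-- Expand the COLUMNS array into one row per source column.
SELECT
    b.ID                     AS batch_id,
    c.value:"column"::NUMBER AS column_number,
    c.value:"name"::STRING   AS column_name,
    c.value:"type"::STRING   AS column_type
FROM relationalai.api.data_stream_batches AS b,
     LATERAL FLATTEN(input => b.COLUMNS) AS c;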

The fields in the JSON object for BATCH_DETAILS are as follows:

| Field Name | Data Type | Description |
| --- | --- | --- |
| writeChangesStart | TIMESTAMP | The timestamp when the batch write changes started. |
| writeChangesEnd | TIMESTAMP | The timestamp when the batch write changes ended. |
| writeChangesDuration | NUMBER | The duration of the batch write changes in milliseconds. |
| rows | NUMBER | The number of rows processed in the batch. |
| size | NUMBER | The size of the batch in bytes. |
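
Individual BATCH_DETAILS fields can be pulled out with Snowflake's semi-structured path syntax. A minimal sketch using the field names above:

-- Report per-batch throughput from the BATCH_DETAILS object.
SELECT
    ID,
    BATCH_DETAILS:"rows"::NUMBER                 AS rows_processed,
    BATCH_DETAILS:"size"::NUMBER                 AS size_bytes,
    BATCH_DETAILS:"writeChangesDuration"::NUMBER AS write_duration_ms
FROM relationalai.api.data_stream_batches;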

The fields in the JSON object for PROCESSING_DETAILS are as follows:

| Field Name | Data Type | Description |
| --- | --- | --- |
| transactionStart | TIMESTAMP | The timestamp when the CDC engine began processing the transaction. |
| transactionEnd | TIMESTAMP | The timestamp when the CDC engine finished processing the transaction. |
| transactionDuration | NUMBER | The duration of the transaction processing in milliseconds. |
| processingErrors | ARRAY | An array of error messages, if any, encountered during batch processing. |
| failedTransactions | ARRAY | An array of transaction IDs that failed during batch processing. |
| retries | NUMBER | The number of retries attempted during batch processing. |
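
For troubleshooting, the processingErrors and retries fields can be surfaced for quarantined batches. Another sketch, again assuming Snowflake path syntax:

-- Show error details for quarantined batches.
SELECT
    ID,
    FQ_OBJECT_NAME,
    PROCESSING_DETAILS:"processingErrors" AS processing_errors,
    PROCESSING_DETAILS:"retries"::NUMBER  AS retries
FROM relationalai.api.data_stream_batches
WHERE STATUS = 'QUARANTINED';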

Example#

Use the api.data_stream_batches view to retrieve data stream batch processing information:

SELECT * FROM relationalai.api.data_stream_batches;
/*+----------------------------------- +-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------+
  | ID                                 | DATA_STREAM_ID                      | RAI_DATABASE | RAI_RELATION              | FQ_OBJECT_NAME            | COLUMNS                          | BATCH_DETAILS      | UNLOADED                | STATUS      | TRANSACTION_ID          | PROCESSING_DETAILS                                               | LAST_UPDATE             |
  |------------------------------------+-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------|
  | dsb_1234abcd_5678_ef90_1234_abcd56 | ds_f9a8d7c2_ae1f_431b_912c_c0e9a123 | MyModel      | example_db.public.table1  | example_db.public.table1  | [{"name": "col1", ...}, ...]     | {"rows": 150, ...} | 2024-10-24 09:16:30.123 | LOADED      | 03bc5678-1234-5678-90ab | {"processingErrors": [], ...}                                    | 2024-10-24 09:16:30.123 |
  | dsb_7890efgh_1234_abcd_5678_efgh12 | ds_a1b2c3d4_e5f6_7a89_b123_d456e789 | SalesModel   | example_db.sales.view1    | example_db.sales.view1    | [{"name": "colA", ...}, ...]     | {"rows": 200, ...} | 2024-10-23 12:25:45.250 | PROCESSING  | 02de4567-8901-2345-6789 | {"processingErrors": [], ...}                                    | 2024-10-23 12:25:45.250 |
  | dsb_abcd5678_ef90_1234_5678_ghij90 | ds_8e7f6d5c_4a3b_2c1d_0e9f_7b6a8d9f | HRModel      | example_db.hr.employees   | example_db.hr.employees   | [{"name": "emp_id", ...}, ...]   | {"rows": 50, ...}  | 2024-10-22 15:39:10.580 | QUARANTINED | 01ef3456-7890-abcd-ef01 | {"processingErrors": ["Error processing data stream", ...], ...} | 2024-10-22 15:39:10.580 |
  | dsb_ef901234_abcd_5678_efgh_1234kl | ds_9a8b7c6d_5e4f_3d2a_1b0e_f7g6h5i3 | FinanceModel | example_db.finance.budget | example_db.finance.budget | [{"name": "amount", ...}, ...]   | {"rows": 120, ...} | 2024-10-21 17:46:10.300 | PENDING     | 03ab4567-ef90-1234-5678 | {"processingErrors": [], ...}                                    | 2024-10-21 17:46:10.300 |
  +----------------------------------- +-------------------------------------+--------------+---------------------------+---------------------------+----------------------------------+--------------------+-------------------------+-------------+-------------------------+------------------------------------------------------------------+-------------------------+ */

Use the api.data_streams view to get information about the data streams for each batch. See Data Management for more information about data streams.
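
For instance, batch status can be combined with stream-level information by joining the two views on the stream identifier. The sketch below assumes the api.data_streams view exposes the stream's unique identifier in a column named ID:

-- Pair each batch with its parent stream's details.
SELECT
    b.ID     AS batch_id,
    b.STATUS AS batch_status,
    s.*
FROM relationalai.api.data_stream_batches AS b
JOIN relationalai.api.data_streams        AS s
  ON b.DATA_STREAM_ID = s.ID;  -- assumes the stream ID column is named ID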

See Also#