Ingesting CaliberMind Data Into Snowflake
June 27, 2024
CaliberMind data is stored in Google BigQuery but can be easily transferred to your Snowflake instance. This guide outlines two methods for this data transfer. Please note that both options may have cost implications with your CaliberMind contract or other vendors. We recommend consulting with your sales representative or account manager to discuss any potential costs.

OPTION 1 – Create a Snowflake Integration on CaliberMind’s Google Cloud Storage (Recommended)
This method involves setting up a direct connection between your Snowflake instance and CaliberMind’s Google Cloud Storage (GCS) bucket. Snowflake can use GCS buckets as an external stage, regardless of your Snowflake instance’s cloud provider (AWS, Azure, etc.). For a more detailed overview of this process, please refer to Snowflake’s help documentation.
Prerequisites:
- You must have the
ACCOUNTADMINrole in your Snowflake organization.
Step 1: Set Up CaliberMind Components in Snowflake
We recommend creating a unique role, schema, and warehouse for CaliberMind to ensure proper security and isolation. You may also want to create a separate database depending on your organization’s needs.
The following script will create a new role, user, warehouse, and database for CaliberMind and grant the necessary permissions.
Important: You must replace [Enter very secure password here] with a strong, unique password. We suggest using a password generator.
begin; -- Create variables for user, password, role, warehouse, and database set role_name = 'CALIBERMIND_ROLE'; set user_name = 'CALIBERMIND_USER'; set user_password = '[Enter very secure password here]'; set warehouse_name = 'CALIBERMIND_WAREHOUSE'; set database_name = 'CALIBERMIND_DATABASE'; /* This is optional */ -- Use the securityadmin role for user and role creation use role securityadmin; -- Create the CaliberMind role create role if not exists identifier($role_name); grant role identifier($role_name) to role SYSADMIN; -- Create a user for CaliberMind create user if not exists identifier($user_name) password = $user_password default_role = $role_name default_warehouse = $warehouse_name; grant role identifier($role_name) to user identifier($user_name); -- Use the sysadmin role for warehouse and database creation use role sysadmin; -- Create a warehouse for CaliberMind create warehouse if not exists identifier($warehouse_name) warehouse_size = xsmall warehouse_type = standard auto_suspend = 10 auto_resume = true initially_suspended = true; -- Create a database for Calibermind create database if not exists identifier($database_name); -- Grant the CaliberMind role access to the warehouse grant USAGE on warehouse identifier($warehouse_name) to role identifier($role_name); -- Grant the CaliberMind role access to the database grant CREATE SCHEMA, MONITOR, USAGE on database identifier($database_name) to role identifier($role_name); grant create stage on schema public to role ($role_name); commit;
Step2: Create a Storage Integration
Next, you’ll need to create a storage integration. CaliberMind will provide you with your GCS bucket URL, which you will use for the storage_allowed_locations field. If you do not have this URL, please contact your account manager.
use role ACCOUNTADMIN;
create storage integration calibermind_int
type = external_stage
storage_provider = gcs
enabled = true
storage_allowed_locations = ('gcs://[calibermind bucket]/[path to data]/');
grant usage on integration calibermind_int to role CALIBERMIND_ROLE;
Creating this integration generates a Google service account within your Snowflake instance. You will need to provide the service account name to CaliberMind so we can grant it access to your GCS bucket.
To find the service account name, run the following command:
desc storage integration calibermind_int;
Copy the property_value from the STORAGE_GCP_SERVICE_ACCOUNT row and send it to your account manager.
| property | property_type | property_value |
|---|---|---|
| ENABLED | Boolean | true |
| STORAGE_ALLOWED_LOCATIONS | List | gcs://[calibermind-project]/[data-folder] |
| STORAGE_GCP_SERVICE_ACCOUNT | String | service-account-id@[calibermind_project].iam.gserviceaccount.com |
Step 3: Configure the Stage and File Format
Once CaliberMind has granted the service account access, you need to set up a FILE_FORMAT and a STAGE in Snowflake. This tells Snowflake how to interpret the data files and represent them as a table.
create or replace file format calibermind_csv_gzip
type = csv
field_delimiter = ','
skip_header = 1
null_if = ('NULL', 'null','NaN')
empty_field_as_null = true
compression = gzip;
use schema calibermind_database.public;
create stage calibermind_gcp_stage
url = 'gcs://[calibermind project]/[data folder]'
storage_integration = calibermind_int;
file_format = calibermind_csv_gzip;
Step 4: Refresh Your Data
Your connection is now set up. The final step is to load the data into Snowflake. We recommend scheduling a nightly job to perform a full table refresh. Please note that CaliberMind currently only supports full data refreshes, not incremental loads.
set table_name = cm_attribution; alter warehouse calibermind_warehouse resume; truncate table @table_name; copy into @table_name from @calibermind_gcp_stage[bucket_name][table_folder] pattern='*.gzip'; alter warehouse calibermind_warehouse suspend;
You should now see your CaliberMind data in your Snowflake instance. The COPY INTO command works for a single table, so you may need to specify the folder or table you want to load. If you have any questions, please contact your account manager or email our support team at moc.dnimrebilac@pleh.
OPTION 2 – Use Your Internal ETL Processes
With this option, CaliberMind will export all your datasets nightly to an encrypted GCS location. You can then use your own ETL service (such as Fivetran, Alteryx, SnapLogic, or MuleSoft) to pull this data into Snowflake.
To do this, set up a new connector for Google Cloud Storage within your ETL tool and follow your vendor’s instructions to complete the connection. We recommend creating a separate connector for each CaliberMind table you want to ingest. If you are unsure which tables your organization needs, please schedule a meeting with our support team to review your options.