Data warehouse integrations are available as a premium add-on for our Web Experimentation and Feature Experimentation module. For more information, please contact your Customer Success Manager.
This documentation explains how to set up a connection to your Databricks SQL warehouse. It covers several configuration steps to be performed in your Databricks account; we recommend that your Databricks administrator carry them out.
Key benefits
- Advanced Targeting: Utilize your data warehouse to create highly personalized and precise campaigns, boosting user engagement and conversion rates.
- Streamlined Data Management: Centralize event data in your data warehouse for easy access, enabling in-depth analysis and reporting.
- Data-Driven Decision-Making: Leverage insights from your data warehouse to inform data-driven decisions and enhance marketing strategies.
Considerations
Keep these things in mind when using this integration:
- Data Volume: Consider the volume of data you plan to interact with, as it can affect query performance and costs.
- Query Complexity: Complex queries may require more time and resources to execute. Optimize your queries for efficiency.
- Data Privacy: Ensure compliance with data privacy regulations when handling user data within your warehouse.
- Access Control: Implement proper access controls to limit who can configure and use the integration within your organization.
- Data Schema: Maintain a clear and consistent data schema to facilitate data retrieval and analysis.
- Monitoring: Regularly monitor your data warehouse usage to manage costs and performance effectively.
- Documentation: Maintain documentation for queries, configurations, and integration processes to facilitate collaboration and troubleshooting.
Databricks
Prerequisites
To configure this integration, you need the following information:
- Databricks personal access token (PAT)
- Permissions to create a Databricks schema and grant access to it.
Setup
1. Create a personal access token (PAT)
Kameleoon authenticates to your Databricks SQL warehouse with a personal access token. We recommend that you create a Databricks service principal and then create a PAT for that service principal.
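As a sketch, assuming you are using a recent version of the Databricks CLI, the service principal can be created as follows (the display name is a placeholder; note the application ID returned in the output, as it is needed in the following steps):

```shell
# Create a service principal for the Kameleoon integration.
# The display name "kameleoon-integration" is a placeholder; choose your own.
# The command output includes an "applicationId" field used in later steps.
databricks service-principals create --display-name "kameleoon-integration"
```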
Once the service principal is created, you can generate a PAT for it with the Databricks CLI, using the service principal’s “Application Id”, which you can find on the service principal management page of the Databricks UI.
databricks token-management create-obo-token {Service Principal Application Id} --lifetime-seconds 7776000 --comment "Token for Kameleoon service principal"
2. Create kameleoon_configuration schema
Create a dedicated schema for Kameleoon polling configuration within the catalog that contains the data that Kameleoon will be polling. This schema must be called “kameleoon_configuration”. You must also grant read and write access to the Service Principal that Kameleoon will be using. Here are some example commands:
CREATE SCHEMA my_catalog.kameleoon_configuration;
GRANT CREATE TABLE ON SCHEMA my_catalog.kameleoon_configuration TO `{Service Principal Application Id}`;
GRANT SELECT ON SCHEMA my_catalog.kameleoon_configuration TO `{Service Principal Application Id}`;
In the commands above, replace {Service Principal Application Id} with your service principal’s application ID.
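To confirm that the grants were applied as expected, you can inspect the schema’s permissions, for example:

```sql
SHOW GRANTS ON SCHEMA my_catalog.kameleoon_configuration;
```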
3. Grant read access to your data
Kameleoon must have read access to the tables you wish to poll into the platform. This can be achieved with commands such as:
GRANT SELECT ON my_catalog.user_data.user_account_table TO `{Service Principal Application Id}`; -- grants read rights on a specific table
GRANT SELECT ON SCHEMA my_catalog.user_data TO `{Service Principal Application Id}`; -- grants read rights on all tables within a schema
4. Authorize Kameleoon IPs (Optional)
If you implement IP access lists, contact your Kameleoon account manager so that we can provide the list of Kameleoon IPs you need to authorize.
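As a sketch, once you have received the IP list, you can authorize it through the Databricks IP access lists REST API. The workspace host, admin token, and IP range below are placeholders; the actual Kameleoon IPs come from your account manager:

```shell
# Placeholder values: replace the host, token, and IP range with your own.
# Creates an ALLOW list containing the Kameleoon IPs in your Databricks workspace.
curl -X POST "https://<your-workspace>.cloud.databricks.com/api/2.0/ip-access-lists" \
  -H "Authorization: Bearer <your-admin-token>" \
  -H "Content-Type: application/json" \
  -d '{
        "label": "kameleoon",
        "list_type": "ALLOW",
        "ip_addresses": ["203.0.113.0/24"]
      }'
```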