Use Databricks as a source: pulling events for reporting

Written by Julie Trenque

Updated on 04/15/2025

3 min

Advanced

Data warehouse integrations are available as a premium add-on for our Web Experimentation and Feature Experimentation modules. For more information, please contact your Customer Success Manager.

Once you’ve activated Databricks for a specific project, you can use it to create goals in Kameleoon. These goals use conversion data read directly from your Databricks database. Here’s how to create a goal using Databricks:

Navigate to the Goals dashboard by clicking Configure in the navigation menu, then Goals.

Inside the Goals dashboard, click on New Goal.

In the pop-up window, you need to provide the following details:

  • Name: Give your goal a descriptive name to identify its purpose.
  • Type: Select “Data Warehouse Tracking”.
  • Data Warehouse: Select “Databricks”.
  • Project: Select the project from the available options. Only projects with Databricks activated are listed.
  • Click Next to proceed.

In the next window, you’ll need to provide additional details to complete the goal configuration:

  • Frequency: Select how often the task should run by choosing one of the predefined time intervals.
  • Databricks Catalog: Enter the name of the Databricks catalog that contains the schemas you want Kameleoon to read from (the top-level folder).
  • Schema: Enter the schema containing the tables this ingestion task will query.
  • Query: Enter a SQL query whose first column contains the user ID (or kameleoonVisitorCode) and whose second column contains the conversion timestamp, following the format described under Query format.
  • Click Validate to save your goal configuration.
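If you are unsure which values to enter for the catalog and schema, one option is to look them up in a Databricks SQL editor before filling in the form. The sketch below uses the placeholder names my_catalog and my_schema, which are assumptions for illustration only; replace them with your own.

```sql
-- List the catalogs visible to your user (the top level of the hierarchy).
SHOW CATALOGS;

-- List the schemas inside one catalog. `my_catalog` is a placeholder name.
SHOW SCHEMAS IN my_catalog;

-- List the tables inside one schema. `my_schema` is a placeholder name.
SHOW TABLES IN my_catalog.my_schema;
```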

Query format

The query must adhere to a specific format:

SELECT visitor_id, conversion_timestamp FROM your_events_table

Where visitor_id is the column containing the unique ID of your visitors and conversion_timestamp is the column containing the exact time at which the conversion took place. In Databricks, the conversion_timestamp column must be of type Timestamp.
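As a quick check of the column type, you could inspect the table in Databricks before saving the goal. The sketch below assumes placeholder names (my_catalog, my_schema) and a hypothetical string column converted_at_text; none of these come from your actual setup.

```sql
-- Show column names and types; conversion_timestamp should be listed as `timestamp`.
DESCRIBE TABLE my_catalog.my_schema.your_events_table;

-- If the time is stored as a string (hypothetical column `converted_at_text`),
-- cast it and expose it under the expected column name.
-- No trailing semicolon on this one, as a cautious choice, since the article
-- states that a condition is added to the query you enter.
SELECT visitor_id, conversion_timestamp
FROM (
  SELECT
    visitor_id,
    CAST(converted_at_text AS TIMESTAMP) AS conversion_timestamp
  FROM your_events_table
) AS subquery
```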

If you want to associate revenue with each conversion, use this alternate format:

SELECT visitor_id, conversion_timestamp, revenue FROM your_events_table

Where revenue is a column containing the revenue for each conversion.

For more complex queries, you can still match this format by wrapping your original query in a sub-query:

SELECT visitor_id, conversion_timestamp, revenue FROM ( {your_original_query} ) AS subquery
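For instance, a query that joins and filters several tables might be wrapped as follows. This is only a sketch: the tables orders and visitor_mapping and all of their columns are hypothetical names chosen for illustration.

```sql
SELECT visitor_id, conversion_timestamp, revenue
FROM (
  -- Original query: completed orders joined to a visitor mapping table.
  -- Aliases rename the source columns to the three expected names.
  SELECT
    v.kameleoon_visitor_code AS visitor_id,
    o.paid_at                AS conversion_timestamp,
    o.total_amount           AS revenue
  FROM orders AS o
  JOIN visitor_mapping AS v
    ON v.customer_id = o.customer_id
  WHERE o.status = 'completed'
) AS subquery
```

Keeping the joins, filters and renaming inside the sub-query leaves the outer SELECT in exactly the expected shape.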

The query you enter will be run every hour in your Databricks warehouse, appended with a “WITH” condition that filters on the timestamps. Note, however, that while your conversions are polled every hour, they are only merged into your experiment results once a day.
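To get a rough idea of which rows a recent run would see, you could run a preview yourself in Databricks. The exact condition Kameleoon adds is not shown in this article, so the one-hour window below is purely an assumption for illustration.

```sql
-- Preview conversions from roughly the last hour.
-- The one-hour window is an assumption, not Kameleoon's actual filter.
SELECT visitor_id, conversion_timestamp
FROM your_events_table
WHERE conversion_timestamp >= current_timestamp() - INTERVAL 1 HOUR
ORDER BY conversion_timestamp DESC
```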

Using your Databricks goal

With the goal created, you can now use it in your Kameleoon campaigns. When setting up an experiment or personalization, you can select this goal in the configuration steps to track and analyze specific conversions directly from your Databricks database. These conversions are merged with your experiment results once a day, so all conversions polled on a given day will be visible the following morning.

To learn how to set up a goal in an experiment, please refer to this article.
