Use Databricks as a source — pulling events for reporting
Once you’ve activated Databricks for a project, you can use it to create goals in Kameleoon. These goals use conversion data read directly from your Databricks database.
To create a goal using Databricks:
- Click Settings > Goals > New goal.
- Enter the following information:
- Name: Give your goal a descriptive name.
- Type: Select Data Warehouse Tracking.
- Data warehouse: Choose Databricks.
- Project: Select your desired project. Only projects with Databricks enabled are listed.
- Click Next.
- In the next window, provide additional details:
- Frequency: Select how often the task should run.
- Databricks catalog: Enter the name of the Databricks catalog that contains the schemas you wish Kameleoon to read from.
- Query: Enter a SQL query that follows the format described under Query format. The first column should contain the user ID (or kameleoonVisitorCode), and the second should contain the timestamp of the corresponding conversion.
- Click Validate to save your goal configuration.
Data warehouse retention period: For an event to be polled, Kameleoon requires that it remain accessible to your input query for at least 72 hours after the event occurred.
Query format
The query must adhere to a specific format:
SELECT visitor_id, conversion_timestamp FROM your_events_table
Where visitor_id is the column representing your visitors' unique ID, and conversion_timestamp is a column representing the exact time at which the conversion took place. In Databricks, the conversion_timestamp column must be a Timestamp type column.
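As an illustration of the basic format, the sketch below assumes a hypothetical table analytics.events in which the visitor ID is stored as user_id and the event time is stored as a string in event_time_str. None of these names come from Kameleoon or Databricks; replace them with your own. Because the timestamp column must be a Timestamp type, the string is converted with the Databricks SQL to_timestamp function.

```sql
-- Hypothetical table and column names; adapt them to your own schema.
SELECT
  user_id AS visitor_id,
  -- Convert a string such as '2024-03-01 14:05:00' to a Timestamp value.
  to_timestamp(event_time_str, 'yyyy-MM-dd HH:mm:ss') AS conversion_timestamp
FROM analytics.events
-- Keep only the events that count as a conversion for this goal.
WHERE event_name = 'purchase'
```

If your column is already a Timestamp, you can select it directly and only rename it.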
If you want to associate revenue with each conversion, the query should follow an alternate format:
SELECT visitor_id, conversion_timestamp, revenue FROM your_events_table
Where revenue is a column containing the revenue for each conversion.
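A revenue query might look like the following sketch. The table shop.orders and the columns customer_id, order_ts, order_total, and status are assumptions made for the example, as is the choice to cast the amount to DOUBLE.

```sql
-- Hypothetical table and column names; adapt them to your own schema.
SELECT
  customer_id AS visitor_id,
  order_ts AS conversion_timestamp,       -- assumed to already be a Timestamp column
  CAST(order_total AS DOUBLE) AS revenue  -- one revenue value per conversion
FROM shop.orders
WHERE status = 'completed'
```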
For more complex queries, you can still match this format by wrapping your original query in a subquery:
SELECT visitor_id, conversion_timestamp, revenue FROM ( {your_original_query} ) AS subquery
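For instance, if revenue has to be computed from order lines, the join and aggregation can live inside the subquery while the outer SELECT exposes only the three expected columns. This is a minimal sketch; the tables shop.orders and shop.order_items and all their columns are hypothetical.

```sql
-- Outer query: exposes exactly the columns the goal expects.
SELECT visitor_id, conversion_timestamp, revenue
FROM (
  -- Inner query: hypothetical join and aggregation, one row per order.
  SELECT
    o.customer_id AS visitor_id,
    o.order_ts AS conversion_timestamp,
    SUM(i.unit_price * i.quantity) AS revenue
  FROM shop.orders AS o
  JOIN shop.order_items AS i
    ON i.order_id = o.order_id
  WHERE o.status = 'completed'
  GROUP BY o.order_id, o.customer_id, o.order_ts
) AS subquery
```

Grouping by order_id keeps two orders placed by the same visitor at the same instant as separate conversions.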
Your query runs every hour in your Databricks warehouse, with an added WITH clause that filters by timestamp. Keep in mind that although conversions are collected hourly, they are merged into your experiment results only once per day.
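The exact statement Kameleoon generates is not shown in this article. Purely as an illustration of how a query can be wrapped in a WITH clause and restricted to a time window, the sketch below uses a hypothetical name goal_query and hypothetical window bounds. It also shows why conversion_timestamp needs to be a Timestamp type: the filter compares it against timestamp values.

```sql
-- Illustrative only: not the actual statement Kameleoon runs.
WITH goal_query AS (
  -- Your query, as entered in the goal configuration.
  SELECT visitor_id, conversion_timestamp FROM your_events_table
)
SELECT visitor_id, conversion_timestamp
FROM goal_query
-- Hypothetical polling window; the real bounds are chosen by Kameleoon.
WHERE conversion_timestamp >= TIMESTAMP'2024-03-01 10:00:00'
  AND conversion_timestamp <  TIMESTAMP'2024-03-01 11:00:00'
```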
Using your Databricks goal
Now that you've created a goal, you can incorporate it into your Kameleoon campaigns. When setting up an experiment or personalization, you can select the goal, allowing you to track and analyze specific conversions in your Databricks database. These conversions are merged with your experiment results once per day, so all conversions polled on a given day will be available the following morning.
To learn how to set up a goal in an experiment, please refer to this article.