Databricks


Prepare credentials

Generate a Personal Access Token

Visit Settings → User Settings, and then switch to the Personal Access Tokens tab.

Then click Generate new token. Save the generated token somewhere safe; you will need it later.
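
Token generation can also be scripted via the Databricks Token API (POST /api/2.0/token/create). The sketch below assumes you already hold a valid token to authenticate the request, so it is mostly useful for rotating credentials later; the host and token values are placeholders.

```python
import requests

# Placeholder values -- substitute your workspace URL and an existing token.
HOST = "https://dbc-xxxxxxxx.cloud.databricks.com"
EXISTING_TOKEN = "dapi..."

resp = requests.post(
    f"{HOST}/api/2.0/token/create",
    headers={"Authorization": f"Bearer {EXISTING_TOKEN}"},
    # 90-day lifetime; the comment makes the token easy to identify later.
    json={"comment": "Datafold data source", "lifetime_seconds": 90 * 24 * 3600},
    timeout=30,
)
resp.raise_for_status()
# The token value is returned only once -- store it securely.
print(resp.json()["token_value"])
```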

Retrieve SQL endpoint settings

In SQL mode, navigate to SQL Endpoints.

Choose the preferred endpoint and copy the following field values from its Connection Details tab (see the verification sketch after this list):

  • Server hostname

  • HTTP path
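
Before creating the data source in Datafold, you can check that the token and endpoint settings work together. Here is a minimal sketch using the open-source databricks-sql-connector package (pip install databricks-sql-connector); the hostname, HTTP path, and token are placeholders for the values you just collected.

```python
from databricks import sql  # pip install databricks-sql-connector

# Placeholder values -- paste the Server hostname and HTTP path copied above.
with sql.connect(
    server_hostname="dbc-xxxxxxxx.cloud.databricks.com",  # no https:// prefix
    http_path="/sql/1.0/endpoints/xxxxxxxxxxxxxxxx",
    access_token="dapi...",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchone())  # (1,) means the endpoint accepted the token
```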

Create a Data Source

In Datafold, open Admin → Settings; you will land on the Data Sources tab. Click + New Data Source and choose Databricks. The connection parameters are in the lower part of the popup window.

The Database parameter has the format CATALOG_NAME.DATABASE_NAME.

In most cases, CATALOG_NAME is hive_metastore.
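
If you are unsure which names to use, you can list them from the same SQL endpoint. A sketch reusing the connector from the verification step (same placeholder connection values); note that SHOW CATALOGS assumes the workspace exposes multiple catalogs, while on a legacy workspace SHOW DATABASES alone is enough.

```python
from databricks import sql

# Same placeholder connection values as in the verification sketch.
with sql.connect(
    server_hostname="dbc-xxxxxxxx.cloud.databricks.com",
    http_path="/sql/1.0/endpoints/xxxxxxxxxxxxxxxx",
    access_token="dapi...",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SHOW CATALOGS")  # candidate CATALOG_NAME values
        print(cursor.fetchall())
        cursor.execute("SHOW DATABASES IN hive_metastore")  # candidate DATABASE_NAME values
        print(cursor.fetchall())
```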

Click Save. Your data source is ready!

After setting permissions in your data source, move on to:

IP Whitelisting ->
Configuration ->