Indexing


Last updated 3 years ago


Indexing is the process of pulling the schema from the database. It runs periodically so that new tables are reflected in the Datafold catalog.

How often to index depends on how often the schema changes; we recommend an interval between 10 and 30 minutes. For example, if a custom Airflow DAG creates new tables, you want those tables to show up in Datafold as soon as possible so that you can run a manual data diff on them. If you use the CI, you could index a bit less frequently.
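To make the trade-off concrete, here is a minimal sketch of how long a newly created table might wait before the next crawl picks it up. It assumes crawls start at fixed multiples of the interval and finish instantly; the function name and parameters are hypothetical and are not part of Datafold.

```python
import math

def minutes_until_indexed(created_at: float, interval: float) -> float:
    """Minutes a table created at `created_at` waits for the next crawl.

    Assumes crawls start at t = 0, interval, 2 * interval, ... and that a
    table created exactly at a crawl time is picked up by that crawl.
    """
    next_crawl = math.ceil(created_at / interval) * interval
    return next_crawl - created_at

# A table created 1 minute after a crawl, under a 10- vs 30-minute interval.
print(minutes_until_indexed(created_at=1, interval=10))  # 9
print(minutes_until_indexed(created_at=1, interval=30))  # 29
```

Under these assumptions the worst-case wait is just under one full interval, which is why a shorter interval suits pipelines that create tables you want to diff right away.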

By default, Datafold crawls all objects that its permissions allow it to see. You can exclude certain schemas from Datafold. Keep in mind that tables in excluded schemas are also excluded from the lineage. We recommend ignoring only temporary tables and the like.
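The effect of excluding schemas can be sketched as a simple filter. This is an illustration only: the glob-style patterns, the `visible_tables` helper, and the sample table names are assumptions and do not reflect Datafold's actual configuration syntax.

```python
from fnmatch import fnmatchcase

def visible_tables(tables, excluded_schemas):
    """Return the (schema, table) pairs whose schema matches no exclusion pattern."""
    return [
        (schema, table)
        for schema, table in tables
        if not any(fnmatchcase(schema, pattern) for pattern in excluded_schemas)
    ]

# Hypothetical catalog: one production schema and two temporary ones.
tables = [
    ("analytics", "orders"),
    ("tmp_scratch", "orders_stage"),
    ("tmp_dev", "t1"),
]
print(visible_tables(tables, ["tmp_*"]))  # [('analytics', 'orders')]
```

Anything filtered out this way would also be missing from the lineage, so a narrow pattern that matches only temporary schemas is the safer choice.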
