Data Vault Configuration

This document describes the global configuration settings for your data vault implementation, which are managed in the data_vault_settings.yaml file. These settings define naming conventions and hashing behavior that apply across the entire data vault, so that every object is generated consistently.

Properties

The table below lists each property available in the data_vault_settings.yaml file, together with its expected data type and a description of its purpose.

| Property Name | Expected Type | Description |
| --- | --- | --- |
| data_vault_name | string | The name of your data vault. |
| is_business_key_case_sensitive | boolean | Specifies whether business keys are treated as case-sensitive. Set to true if you need distinct entries for keys that differ only by case (e.g., "ABC" vs. "abc"). |
| load_timestamp_column_name | string | The name of the load timestamp column in the target data vault object. |
| hashdiff_column_name | string | The name of the hash difference column in the target data vault object. |
| record_source_column_name | string | The name of the column indicating the source of the record in the target data vault object. |
| source_business_key_column_name | string | The name of the column storing the source system business key in the target data vault object. |
| cdc_flag_column_name | string | The name of the CDC (Change Data Capture) flag column in the target data vault object. |
| hashkey_delimiter | string | The delimiter used when concatenating multiple columns to form a hash key. |
| hash_key_column_prefix | string | The prefix added to hash key column names. |
| hashing_algorithm | string | The hashing algorithm to use (e.g., "HASH", "MD5", "SHA1", "SHA2", "NOHASH"). |
| use_binary_hashing_algorithm | boolean | Specifies whether the chosen hashing algorithm should produce binary output. |
| multi_source_databases | boolean | Indicates whether the data vault integrates data from multiple source databases (affects the source tuple format in the entity_source property). |
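
To make the interplay of the hashing-related properties more concrete, the following Python sketch shows one way a hash key could be derived from them. It is an illustration under stated assumptions, not the tool's actual implementation: the function name build_hash_key is hypothetical, case-insensitive keys are assumed to be normalized to upper case, "SHA2" is assumed to mean SHA-256, and the platform-specific "HASH" option is not modeled.

```python
import hashlib

# Assumed mapping from the documented algorithm names to hashlib functions.
# "SHA2" is taken to mean SHA-256 here; "HASH" is platform-specific and not modeled.
_ALGORITHMS = {
    "MD5": hashlib.md5,
    "SHA1": hashlib.sha1,
    "SHA2": hashlib.sha256,
}


def build_hash_key(business_keys, settings):
    """Hypothetical hash key computation driven by the global settings."""
    parts = [str(key) for key in business_keys]
    # Assumption: case-insensitive keys are normalized to upper case before hashing.
    if not settings["is_business_key_case_sensitive"]:
        parts = [part.upper() for part in parts]
    # Multiple business key columns are concatenated with the configured delimiter.
    joined = settings["hashkey_delimiter"].join(parts)

    algorithm = settings["hashing_algorithm"]
    if algorithm == "NOHASH":
        # No hashing: the concatenated business key is used directly.
        return joined
    if algorithm not in _ALGORITHMS:
        raise ValueError(f"algorithm not modeled in this sketch: {algorithm}")
    digest = _ALGORITHMS[algorithm](joined.encode("utf-8"))
    # Binary output returns raw bytes; otherwise an upper-case hexadecimal string.
    if settings["use_binary_hashing_algorithm"]:
        return digest.digest()
    return digest.hexdigest().upper()


settings = {
    "is_business_key_case_sensitive": False,
    "hashkey_delimiter": "##",
    "hashing_algorithm": "MD5",
    "use_binary_hashing_algorithm": False,
}
# With case-insensitive keys, "abc" and "ABC" yield the same hash key.
print(build_hash_key(["abc", 42], settings) == build_hash_key(["ABC", 42], settings))  # prints True
```

In this sketch, setting is_business_key_case_sensitive to true would make "abc" and "ABC" produce different hash keys, which corresponds to the "distinct entries" behavior described for that property.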

Example

Below is an example of a data_vault_settings.yaml file with common configurations. You can adapt these settings to match your specific data vault environment and requirements.

data_vault_name: 'My Data Vault Model'
is_business_key_case_sensitive: false
load_timestamp_column_name: 'LOAD_DATE'
hashdiff_column_name: 'HASH_DIFF'
record_source_column_name: 'RECORD_SOURCE'
hashkey_delimiter: '##'
hash_key_column_prefix: 'HKEY_'
source_business_key_column_name: 'SRC_BK'
cdc_flag_column_name: 'CDC_FLAG'
hashing_algorithm: 'HASH'
use_binary_hashing_algorithm: false
multi_source_databases: false
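
Before using a settings file, it can help to check that each property has the type listed in the properties table. The Python sketch below is one possible check, not part of the product: validate_settings and EXPECTED_TYPES are hypothetical names, and treating every property as required is an assumption. It operates on an already parsed mapping; reading the YAML file itself would need a parser such as PyYAML (yaml.safe_load), which is left out so the example stays self-contained.

```python
# Expected type for each property, taken from the properties table.
EXPECTED_TYPES = {
    "data_vault_name": str,
    "is_business_key_case_sensitive": bool,
    "load_timestamp_column_name": str,
    "hashdiff_column_name": str,
    "record_source_column_name": str,
    "source_business_key_column_name": str,
    "cdc_flag_column_name": str,
    "hashkey_delimiter": str,
    "hash_key_column_prefix": str,
    "hashing_algorithm": str,
    "use_binary_hashing_algorithm": bool,
    "multi_source_databases": bool,
}


def validate_settings(settings):
    """Return a list of problems found in a parsed settings mapping."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        # Assumption: every documented property is required.
        if name not in settings:
            problems.append(f"missing property: {name}")
        elif not isinstance(settings[name], expected):
            actual = type(settings[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Flag names that do not appear in the properties table (likely typos).
    for name in settings:
        if name not in EXPECTED_TYPES:
            problems.append(f"unknown property: {name}")
    return problems


# The same values as the example file above, as a parsed mapping.
example_settings = {
    "data_vault_name": "My Data Vault Model",
    "is_business_key_case_sensitive": False,
    "load_timestamp_column_name": "LOAD_DATE",
    "hashdiff_column_name": "HASH_DIFF",
    "record_source_column_name": "RECORD_SOURCE",
    "hashkey_delimiter": "##",
    "hash_key_column_prefix": "HKEY_",
    "source_business_key_column_name": "SRC_BK",
    "cdc_flag_column_name": "CDC_FLAG",
    "hashing_algorithm": "HASH",
    "use_binary_hashing_algorithm": False,
    "multi_source_databases": False,
}
print(validate_settings(example_settings))  # prints []
```

A common mistake this kind of check catches is quoting a boolean, for example writing multi_source_databases: 'false', which YAML parses as a string rather than a boolean.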