Data Vault Configuration
This document describes the global configuration settings for your data vault implementation, which are managed through the data_vault_settings.yaml file. These parameters define core behaviors and naming conventions that apply across the entire data vault.
Properties
The table below lists each property available in the data_vault_settings.yaml file, with its expected data type and a description of its purpose.
Property Name | Expected Type | Description |
---|---|---|
data_vault_name | string | The name of your data vault. |
is_business_key_case_sensitive | boolean | Specifies if business keys should be treated as case-sensitive. Set to true if you need distinct entries for keys that differ only by case (e.g., "ABC" vs. "abc"). |
load_timestamp_column_name | string | The name of the load timestamp column in the target data vault object. |
hashdiff_column_name | string | The name of the hash difference column in the target data vault object. |
record_source_column_name | string | The name of the column that indicates the source of the record in the target data vault object. |
source_business_key_column_name | string | The name of the column that stores the source system business key in the target data vault object. |
cdc_flag_column_name | string | The name of the CDC (Change Data Capture) flag column in the target data vault object. |
hashkey_delimiter | string | The delimiter used when concatenating multiple columns to form a hash key. |
hash_key_column_prefix | string | The prefix to be added to hash key column names. |
hashing_algorithm | string | The hashing algorithm to use (e.g., "HASH", "MD5", "SHA1", "SHA2", "NOHASH"). |
use_binary_hashing_algorithm | boolean | Specifies whether the chosen hashing algorithm should produce binary output. |
multi_source_databases | boolean | Indicates whether the data vault integrates data from multiple source databases (affects the source tuple format in the entity_source property). |
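To make the interaction between is_business_key_case_sensitive, hashkey_delimiter, hashing_algorithm, and use_binary_hashing_algorithm more concrete, here is a minimal Python sketch of how a hash key could be derived from several business key columns. It is an illustration under assumptions, not the tool's actual implementation: the build_hash_key helper is hypothetical, MD5 is picked as one of the listed algorithms, and trimming plus upper-casing for case-insensitive keys is a common data vault convention that the settings file itself does not confirm.

```python
import hashlib


def build_hash_key(business_keys, delimiter="##", case_sensitive=False, binary=False):
    """Hypothetical helper: combine business key values into one MD5 hash key.

    Assumptions (not confirmed by the settings documentation):
    - values are trimmed, and upper-cased when keys are case-insensitive
    - NULL (None) values are treated as empty strings
    - text output is an upper-case hex string, binary output is raw bytes
    """
    parts = []
    for value in business_keys:
        text = "" if value is None else str(value).strip()
        if not case_sensitive:
            # Mirrors is_business_key_case_sensitive: false
            text = text.upper()
        parts.append(text)

    # Mirrors hashkey_delimiter: the separator keeps ("ab", "c") and
    # ("a", "bc") from producing the same concatenated string.
    digest = hashlib.md5(delimiter.join(parts).encode("utf-8"))

    # Mirrors use_binary_hashing_algorithm
    return digest.digest() if binary else digest.hexdigest().upper()


# Case-insensitive keys: "abc" and "ABC" map to the same hash key.
same = build_hash_key(["abc", 42]) == build_hash_key(["ABC", 42])

# Case-sensitive keys: the two values produce distinct hash keys.
distinct = build_hash_key(["abc"], case_sensitive=True) != build_hash_key(
    ["ABC"], case_sensitive=True
)
```

In this sketch the text form is a 32-character hex string and the binary form is 16 raw bytes, which is the usual trade-off behind a binary hashing option: smaller keys at the cost of readability.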
Example
Below is an example of a data_vault_settings.yaml file with common configurations. Adapt these settings to match your data vault environment and requirements.
data_vault_name: 'My Data Vault Model'
is_business_key_case_sensitive: false
load_timestamp_column_name: 'LOAD_DATE'
hashdiff_column_name: 'HASH_DIFF'
record_source_column_name: 'RECORD_SOURCE'
hashkey_delimiter: '##'
hash_key_column_prefix: 'HKEY_'
source_business_key_column_name: 'SRC_BK'
cdc_flag_column_name: 'CDC_FLAG'
hashing_algorithm: 'HASH'
use_binary_hashing_algorithm: false
multi_source_databases: false
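A quick way to catch typos such as a quoted 'false' where a boolean is expected is to check the parsed settings against the types listed in the Properties table. The Python sketch below assumes the file has already been loaded into a dictionary by a YAML parser of your choice. The validate_settings function is hypothetical and not part of the tool, and treating every property as required is an assumption made for illustration.

```python
# Expected types, taken from the Properties table.
EXPECTED_TYPES = {
    "data_vault_name": str,
    "is_business_key_case_sensitive": bool,
    "load_timestamp_column_name": str,
    "hashdiff_column_name": str,
    "record_source_column_name": str,
    "source_business_key_column_name": str,
    "cdc_flag_column_name": str,
    "hashkey_delimiter": str,
    "hash_key_column_prefix": str,
    "hashing_algorithm": str,
    "use_binary_hashing_algorithm": bool,
    "multi_source_databases": bool,
}


def validate_settings(settings):
    """Hypothetical check: return a list of problems found in the settings."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in settings:
            # Assumption: every property is required.
            problems.append(f"missing property: {name}")
        elif not isinstance(settings[name], expected):
            actual = type(settings[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    for name in settings:
        if name not in EXPECTED_TYPES:
            problems.append(f"unknown property: {name}")
    return problems


# The example file above, as a YAML parser would typically load it.
example_settings = {
    "data_vault_name": "My Data Vault Model",
    "is_business_key_case_sensitive": False,
    "load_timestamp_column_name": "LOAD_DATE",
    "hashdiff_column_name": "HASH_DIFF",
    "record_source_column_name": "RECORD_SOURCE",
    "hashkey_delimiter": "##",
    "hash_key_column_prefix": "HKEY_",
    "source_business_key_column_name": "SRC_BK",
    "cdc_flag_column_name": "CDC_FLAG",
    "hashing_algorithm": "HASH",
    "use_binary_hashing_algorithm": False,
    "multi_source_databases": False,
}

problems = validate_settings(example_settings)
```

For the example settings the returned list is empty. Quoting a boolean, for instance multi_source_databases: 'false', would be reported as a type mismatch because the parser reads it as a string.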