Configuration

Overview

The Stream2Vault application consists of a number of components, including a client, a generation service, and the authentication modules. You will use the Stream2Vault client to generate Data Vault objects such as hubs, satellites, and links.

Before you start generating the data vault, you need to configure two distinct components: the data vault itself and the source system. Note that some of these parameters determine the naming conventions of your generated objects.

Data Vault Configuration

The data_vault_settings.yaml file is a YAML configuration file in the configuration/ directory of a project which lets you define environment variables, parameters, and other properties for the project and functions within it.

Properties

Data Vault Name (data_vault_name)

This is the name of your data vault project.

Business Key Case Sensitive Flag (is_business_key_case_sensitive)

Boolean value indicating whether the business key is case sensitive.

Load Timestamp Column Name (load_timestamp_column_name)

The name of the column in the source which contains the load timestamp. This column is used to build CDC logic.

Hashdiff Column Name (hashdiff_column_name)

The name of the column containing the hashed value of the full record. This column is used to detect changes in the incoming data.
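To illustrate how a hashdiff column can be used to detect changes, here is a minimal Python sketch. It is not Stream2Vault's generated code: the function names compute_hashdiff and has_changed, the '##' delimiter, the use of SHA1, and the rendering of missing values as empty strings are all assumptions made for the example.

```python
import hashlib


def compute_hashdiff(record, columns, delimiter="##"):
    """Hash the listed columns of a record into one upper-case hex digest.

    Hypothetical helper: the delimiter and SHA1 are assumptions for this sketch.
    """
    # Render missing values as empty strings so they hash consistently.
    parts = ["" if record.get(col) is None else str(record[col]) for col in columns]
    joined = delimiter.join(parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest().upper()


def has_changed(incoming, stored_hashdiff, columns):
    """Return True when the incoming record no longer matches the stored hashdiff."""
    return compute_hashdiff(incoming, columns) != stored_hashdiff


columns = ["name", "city"]
stored = compute_hashdiff({"name": "Alice", "city": "Berlin"}, columns)
print(has_changed({"name": "Alice", "city": "Berlin"}, stored, columns))   # False
print(has_changed({"name": "Alice", "city": "Hamburg"}, stored, columns))  # True
```

Comparing a single hashed value avoids comparing every descriptive column individually when deciding whether a new record version needs to be loaded.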

Record Source Column Name (record_source_column_name)

The name of the column containing the name of the record source, indicating where specific records originate from. Used in situations where multiple sources are being loaded into a single destination table in the data vault.

Hash Key Delimiter (hashkey_delimiter)

Hash Key Column Prefix (hash_key_column_prefix)

Source Business Key Column Name (source_business_key_column_name)

The name used for the source system column in the data vault destination tables.

CDC Flag Column Name (cdc_flag_column_name)

The name to be used for the CDC flag column.

Hashing Algorithm (hashing_algorithm)

The algorithm used for hashing. Possible values include SHA1.

Use Binary Hashing Algorithm (use_binary_hashing_algorithm)

Boolean flag indicating if the binary hashing algorithm should be used.

Basic Example

data_vault_name: MY_DATA_VAULT
is_business_key_case_sensitive: false
load_timestamp_column_name: LOAD_DATE
hashdiff_column_name: HDIFF
record_source_column_name: REC_SRC
hashkey_delimiter: '##'
hash_key_column_prefix: 'HKY_'
source_business_key_column_name: SYSTEM_ID
cdc_flag_column_name: CDC_FLAG
hashing_algorithm: SHA1
use_binary_hashing_algorithm: false
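As a rough illustration of how the hashing-related settings above might interact, the following Python sketch derives a hash key from a list of business key parts. This is an assumption about typical Data Vault hashing, not a description of Stream2Vault's internals: the function compute_hash_key is hypothetical, and upper-casing case-insensitive keys and returning raw bytes for the binary option are choices made for the example.

```python
import hashlib

# Values mirror the Basic Example above; only hashing-related keys are used.
SETTINGS = {
    "is_business_key_case_sensitive": False,
    "hashkey_delimiter": "##",
    "hashing_algorithm": "SHA1",
    "use_binary_hashing_algorithm": False,
}


def compute_hash_key(business_keys, settings):
    """Join business key parts and hash them (illustrative sketch only)."""
    parts = [str(key) for key in business_keys]
    if not settings["is_business_key_case_sensitive"]:
        # Assumption: case-insensitive keys are upper-cased before hashing.
        parts = [part.upper() for part in parts]
    joined = settings["hashkey_delimiter"].join(parts)
    # hashlib.new takes the algorithm name, here lower-cased to "sha1".
    digest = hashlib.new(settings["hashing_algorithm"].lower(), joined.encode("utf-8"))
    if settings["use_binary_hashing_algorithm"]:
        return digest.digest()  # raw bytes (20 bytes for SHA1)
    return digest.hexdigest().upper()  # hex string (40 characters for SHA1)


first = compute_hash_key(["cust-001", "DE"], SETTINGS)
second = compute_hash_key(["CUST-001", "de"], SETTINGS)
print(first == second)  # True, because case is ignored in this configuration
```

Under these assumptions, the delimiter keeps composite keys such as ("AB", "C") and ("A", "BC") from collapsing into the same input string before hashing.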

Source System Configuration

The source_system_settings.yaml file is a YAML configuration file in the configuration/ directory of a project which lets you define environment variables, parameters, and other properties for the source system used as input to the Stream2Vault code generation process.

Constraints:

  • Stream2Vault currently supports only the YAML interface type.
  • Each project can have only a single source system.

Properties

Source System URN

This is the main group for storing the source system parameters. Currently, only the urn:s2v:source_setting:yaml_interface format is supported.

Hashkey Escape Character (hashkey_escape_char)

Treat Empty Values as NULL (empty_value_is_null)

Boolean flag indicating if empty values should be treated as NULL.

Trim Whitespace (trim_whitespaces)

Boolean flag indicating whether whitespace should be trimmed.

Load Timestamp Column Name (load_timestamp_column_name)

The name of the column in the source which contains the load timestamp. This column is used to build CDC logic.

CDC Flag Column Name (cdc_flag_column_name)

The name to be used for the CDC flag column.

Basic Example: YAML Input

source_system_settings:
  - urn:s2v:source_setting:yaml_interface:
      hashkey_escape_char: '\\'
      empty_value_is_null: false
      trim_whitespaces: true
      load_timestamp_column_name: CDC_TIMESTAMP
      cdc_flag_column_name: CDC_FLAG
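The effect of the trim_whitespaces and empty_value_is_null flags can be sketched in Python as follows. The function normalize_value is hypothetical and does not come from Stream2Vault; in particular, applying the trim before the empty check is an assumption, so the actual order in the generated code may differ.

```python
def normalize_value(value, trim_whitespaces=True, empty_value_is_null=False):
    """Apply the trimming and empty-to-NULL flags to a single source value."""
    if value is None:
        return None
    if trim_whitespaces:
        value = value.strip()
    # Assumption: trimming runs first, so "   " counts as empty when both flags are set.
    if empty_value_is_null and value == "":
        return None
    return value


print(repr(normalize_value("  Berlin ")))                      # 'Berlin'
print(repr(normalize_value("   ", empty_value_is_null=True)))  # None
print(repr(normalize_value("   ", trim_whitespaces=False)))    # '   '
```

With the values from the example above (trimming on, empty values kept), a blank source field would end up as an empty string rather than NULL.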