Transformations Reference
This page is a complete reference for all transformations supported by Stream2Vault. Transformations allow you to apply Snowflake SQL functions to column values directly within your YAML model — without pre-processing data outside of S2V.
Applicability
Transformations can be used in two contexts:
| Context | Where | YAML location |
|---|---|---|
| Business Key Mapping | Applied to source columns before they are hashed into a Business Key | business_key_mapping → column transformations |
| Masterdata Rules | Applied to columns within masterdata/mart rule expressions | masterdata entity rule columns |
The table below shows which transformations are available in each context.
Full Transformation List
Notation
All transformations follow the column: - <transformation> syntax:
business_key_mapping:
- BK:
- MY_COLUMN:
- left: 8 # single transformation
- cast: VARCHAR # chained — applied after left
Multiple transformations on the same column are applied in the order listed (top to bottom).
Available Transformations
| Transformation | Arguments | Details | Business Key | Masterdata |
|---|---|---|---|---|
substring | 2 | Extracts a portion of a string starting at a given position for a specified length. Parameters: position (int), length (int) Example: substring: (1,3) | Yes | Yes |
left | 1 | Returns the leftmost N characters of a string. Parameters: length (int) Example: left: 8 | Yes | Yes |
right | 1 | Returns the rightmost N characters of a string. Parameters: length (int) Example: right: 5 | Yes | Yes |
leftpadding | 2 | Left-pads a string to a specified width using a given character. Parameters: width (int), pad_char (str) Example: leftpadding: (10,'0') | Yes | Yes |
rightpadding | 2 | Right-pads a string to a specified width using a given character. Parameters: width (int), pad_char (str) Example: rightpadding: (10,' ') | Yes | Yes |
replace_by_values | 2 | Replaces all occurrences of a string literal with another string literal. Parameters: old_value (str), new_value (str) Example: replace_by_values: ('-',' ') | Yes | Yes |
replace_by_column_and_value | 2 | Replaces all occurrences of a column's value with a string literal. Parameters: column (column), new_value (str) Example: replace_by_column_and_value: (COL,'x') | Yes | Yes |
replace_by_value_and_column | 2 | Replaces all occurrences of a string literal with a column's value. Parameters: old_value (str), column (column) Example: replace_by_value_and_column: ('-',COL) | Yes | Yes |
replace_by_columns | 2 | Replaces all occurrences of one column's value with another column's value. Parameters: source_col (column), replacement_col (column) Example: replace_by_columns: (COL1,COL2) | Yes | Yes |
ltrim | 0 | Removes leading whitespace from a string. Parameters: — Example: ltrim | Yes | Yes |
rtrim | 0 | Removes trailing whitespace from a string. Parameters: — Example: rtrim | Yes | Yes |
trim | 0 | Removes both leading and trailing whitespace from a string. Parameters: — Example: trim | Yes | Yes |
cast | 1 | Casts a value to the specified SQL data type. Parameters: target_type (data_type) Example: cast: VARCHAR | No | Yes |
initcap | 0 | Capitalizes the first letter of each word in a string. Parameters: — Example: initcap | No | Yes |
capitalization | 1 | Converts a string to uppercase (UPPER) or lowercase (LOWER).Parameters: direction (UPPER or LOWER) Example: capitalization: UPPER | No | Yes |
concatenate | varies | Joins multiple input columns using a separator. Parameters: separator (str) Example: combine_type: (concatenate, ', ') | No | Yes |
coalesce | varies | Returns the first non-null value from the input columns. Parameters: — Example: combine_type: (coalesce) | No | Yes |
Transformation Details
substring
Extracts a substring. Arguments: (start, end) — both integers.
- MY_COL:
- substring: (1,3) # extracts characters 1 through 3
left
Returns the first N characters. Argument: integer.
- MY_COL:
- left: 8 # first 8 characters
right
Returns the last N characters. Argument: integer.
- MY_COL:
- right: 4 # last 4 characters
leftpadding
Left-pads the value to a given length with the specified character. Arguments: (length, char).
- MY_COL:
- leftpadding: (10,'0') # pad with zeros on the left up to 10 chars
rightpadding
Right-pads the value to a given length with the specified character. Arguments: (length, char).
- MY_COL:
- rightpadding: (10,' ') # pad with spaces on the right up to 10 chars
replace_by_values
Replaces all occurrences of a literal string with another literal string. Arguments: (search, replace).
- MY_COL:
- replace_by_values: ('-',' ') # replace all '-' with ' '
replace_by_column_and_value
Replaces occurrences using a source column as the search value and a literal string as the replacement. Arguments: (column, literal).
- MY_COL:
- replace_by_column_and_value: (OTHER_COL,'x') # REPLACE(MY_COL, OTHER_COL, 'x')
replace_by_value_and_column
Replaces occurrences of a literal string with the value of a source column. Arguments: (literal, column).
- MY_COL:
- replace_by_value_and_column: ('-',OTHER_COL) # REPLACE(MY_COL, '-', OTHER_COL)
replace_by_columns
Replaces occurrences of one column's value with another column's value. Arguments: (search_col, replace_col).
- MY_COL:
- replace_by_columns: (COL1,COL2) # REPLACE(MY_COL, COL1, COL2)
ltrim
Removes leading whitespace. No arguments.
- MY_COL:
- ltrim
rtrim
Removes trailing whitespace. No arguments.
- MY_COL:
- rtrim
trim
Removes both leading and trailing whitespace. No arguments.
- MY_COL:
- trim
cast (Masterdata only)
Casts the column value to a Snowflake data type. Argument: a valid Snowflake data type string.
- MY_COL:
- cast: VARCHAR # CAST(MY_COL AS VARCHAR)
- cast: NUMBER(10,2) # CAST(MY_COL AS NUMBER(10,2))
initcap (Masterdata only)
Capitalizes the first letter of each word (INITCAP). No arguments.
- MY_COL:
- initcap
capitalization (Masterdata only)
Converts the value to UPPER or LOWER case. Argument: UPPER or LOWER.
- MY_COL:
- capitalization: UPPER
concatenate (Masterdata only)
Joins multiple column values together using a specified separator character.
Syntax: (concatenate, <separator>)
Generated SQL: CONCAT_WS(<separator>, column1, column2, ...)
input_columns:
- 'STREET'
- 'CITY'
- 'ZIP_CODE'
combine_type: (concatenate, ', ')
coalesce (Masterdata only)
Returns the first non-null value from a list of input columns.
Syntax: (coalesce)
Generated SQL: COALESCE(column1, column2, ...)
input_columns:
- 'PRIMARY_PHONE'
- 'SECONDARY_PHONE'
combine_type: (coalesce)
Chaining Transformations
Transformations can be chained on the same column — they are applied in order, with each transformation receiving the output of the previous one:
input_columns:
- CUSTOMER_NAME:
- trim
- left: 50
- capitalization: 'UPPER'
This produces the SQL expression: UPPER(LEFT(TRIM(CUSTOMER_NAME), 50))
Complete Example
The following example demonstrates various transformation types applied within a Masterdata rule:
rules:
- FORMATTED_ID:
- src:
input_columns:
- RAW_ID:
- cast: 'VARCHAR'
- leftpadding: (10,'0')
- CLEAN_NAME:
- src:
input_columns:
- FIRST_NAME:
- trim
- initcap
- LAST_NAME:
- trim
- initcap
combine_type: (concatenate, ' ')
- REGION_CODE:
- src:
input_columns:
- COUNTRY_NAME:
- left: 3
- capitalization: 'UPPER'
- replace_by_values: (' ','_')