Hub
In Data Vault 2.0, a Hub represents a core business concept or entity. It contains a distinct list of unique business keys from across the enterprise, acting as an integration point. Hubs are designed to be stable and resilient to changes in source systems.
The primary purpose of a Hub is to store these business keys, along with metadata such as load timestamps and record sources, but not the descriptive attributes of the business entity itself (those are stored in Satellites).
Below are the properties you can use to define a Hub.
Hub Properties Overview
Property | Type | Description |
---|---|---|
name | String | The unique name for the Hub object. This name will be used for the generated table and in references by other objects (e.g., Links, Satellites). Example: 'HUB_MATERIAL' |
entity_type | String | Must be set to 'hub' to define this object as a Hub. |
concatenate_business_keys | Boolean | If true and multiple source columns are mapped to a single business key, their values are concatenated before hashing to form the Hub's hash key. |
requires_source_business_key | Boolean | If true, each source definition within entity_sources must provide a non-empty source_business_key value. |
enable_refresh | Boolean | If true, the Hub receives new data. Set to false if the source contains static (historical) data that no longer changes. See Shared Properties. |
target_business_key_columns | List of Strings | A list defining the names of the column(s) that will store the business key(s) in the generated Hub table. Example: ['MATERIAL_ID']. |
entity_sources | List of Maps | Defines the list of sources that feed data into this Hub. Each item in the list represents a distinct source, following the structure outlined in Entity Sources Definition. |
Hub-Specific Property Details
concatenate_business_keys
- Type: Boolean
- Description: This property determines how the Hub's hash key is generated when multiple source columns are mapped to the Hub's business keys.
  - If true: The values from the multiple source columns (as defined in business_key_mapping for a given entity_source) are concatenated together (using a delimiter defined in data_vault_settings.yaml) before the hash function is applied. This is typically used when a combination of several source attributes forms a single conceptual business key. The target_business_key_columns must list a single column name for this concatenated key.
  - If false: Each business key defined in target_business_key_columns is treated independently. If multiple target business keys are defined and mapped from source columns, each will result in a separate hash key calculation if not handled by other logic. This is common for Hubs with a single, clearly defined business key from the source or when business keys are sourced independently and are not part of a composite key from a single source instance.
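As a rough illustration of the true case, the fragment below sketches a Hub whose single business key is built from two source columns. The Hub name, source table, column names, and URNs are hypothetical. The layout follows the examples on this page, and listing several source columns under one target column is an assumption about how such a mapping is written.

```yaml
entity_type: 'hub'
name: 'HUB_CUSTOMER'                        # hypothetical Hub name
concatenate_business_keys: true             # source values are concatenated before hashing
requires_source_business_key: false
enable_refresh: true
target_business_key_columns:
  - 'CUSTOMER_KEY'                          # single target column for the concatenated key
entity_sources:
  - urn:s2v:hub_source:crm_customers:       # hypothetical source URN
      entity_source: '(CRM_DB, dbo, Customers)'
      source_filter: ''
      source_system_configuration_urn: 'urn:s2v:source_setting:crm_config'
      business_key_mapping:
        - CUSTOMER_KEY:
            - 'CountryCode'                 # first part of the conceptual key (assumed column)
            - 'CustomerNumber'              # second part of the conceptual key (assumed column)
      source_business_key: ''               # empty because requires_source_business_key is false
```

The delimiter placed between the two values is not set here; per the description above, it comes from data_vault_settings.yaml.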
requires_source_business_key
- Type: Boolean
- Description: This property enforces whether the source_business_key field must be populated within each definition in the entity_sources list.
  - If true: Each source entry in entity_sources must provide a non-empty string value for its source_business_key field. This is crucial for multi-master Hubs where the same business key value (e.g., customer ID '123') might originate from different source systems (e.g., 'CRM' and 'Billing'). The source_business_key helps distinguish these instances, ensuring the uniqueness of the Hub's hash key when combined with the actual business key.
  - If false: The source_business_key field within each entity_source definition must be empty or omitted. This is suitable when the business keys are globally unique or when the Hub is not designed to handle multi-master scenarios where the same key value can appear in different source systems with different meanings or contexts.
target_business_key_columns
- Type: List of Strings
- Description: This property defines the names of the column(s) that will store the business key(s) in the generated Hub table.
  - If concatenate_business_keys is true and multiple source columns contribute to a single conceptual key, this list must contain the single target column name for that concatenated key.
  - If concatenate_business_keys is false and you list multiple column names here, it implies the Hub represents a composite business key where each part is stored separately. Each of these columns would then need to be mapped from the source(s) in the business_key_mapping.
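To make the composite case concrete, here is a minimal sketch of a Hub with concatenate_business_keys set to false and two target columns, each mapped from its own source column. All names (Hub, table, columns, URNs) are hypothetical and chosen only for illustration; the structure mirrors the examples on this page.

```yaml
entity_type: 'hub'
name: 'HUB_FLIGHT'                          # hypothetical Hub name
concatenate_business_keys: false            # each key part is stored in its own column
requires_source_business_key: false
enable_refresh: true
target_business_key_columns:
  - 'CARRIER_CODE'
  - 'FLIGHT_NUMBER'
entity_sources:
  - urn:s2v:hub_source:ops_flights:         # hypothetical source URN
      entity_source: '(OPS_DB, dbo, Flights)'
      source_filter: ''
      source_system_configuration_urn: 'urn:s2v:source_setting:ops_config'
      business_key_mapping:
        - CARRIER_CODE:
            - 'Carrier'                     # assumed source column
        - FLIGHT_NUMBER:
            - 'FlightNo'                    # assumed source column
      source_business_key: ''               # empty because requires_source_business_key is false
```

Every column listed in target_business_key_columns has a matching entry in business_key_mapping, which is the requirement described above.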
Simple Example
This example defines a simple HUB_PRODUCT with a single business key PRODUCT_SKU sourced from one table.
entity_type: 'hub'
name: 'HUB_PRODUCT'
concatenate_business_keys: false
requires_source_business_key: false # Not a multi-master hub
enable_refresh: true
target_business_key_columns:
  - 'PRODUCT_SKU'
entity_sources:
  - urn:s2v:hub_source:inventory_products:
      entity_source: '(INVENTORY_DB, dbo, Products)'
      source_filter: "IsActive = 1"
      source_system_configuration_urn: 'urn:s2v:source_setting:inventory_config'
      business_key_mapping:
        - PRODUCT_SKU:
            - 'SKU'
      source_business_key: '' # Empty as requires_source_business_key is false
Comprehensive Hub Example
This example defines HUB_MATERIAL, which integrates material identifiers from three different source contexts or systems, all mapping to a single target business key MATERIAL_ID.
entity_type: 'hub'
name: 'HUB_MATERIAL'
concatenate_business_keys: false
requires_source_business_key: true # Assuming MATERIAL_ID is NOT globally unique across these sources
enable_refresh: true
target_business_key_columns:
  - 'MATERIAL_ID'
entity_sources:
  - urn:s2v:hub_source:src_erp_materials:
      entity_source: '(ERP_DATA, dbo, MaterialMaster)'
      source_filter: "TYPE = 'RAW'"
      source_system_configuration_urn: 'urn:s2v:source_setting:erp_system_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'MaterialInternalID'
      source_business_key: 'ERP_DATA'
  - urn:s2v:hub_source:src_legacy_parts:
      entity_source: '(LEGACY_SYSTEM, parts_data, PartTable)'
      source_filter: '' # No filter for this source
      source_system_configuration_urn: 'urn:s2v:source_setting:legacy_system_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'PartGlobalIdentifier'
      source_business_key: 'LEGACY_SYSTEM'
  - urn:s2v:hub_source:src_webshop_catalog:
      entity_source: '(WEBSHOP_DB, catalog, ProductItems)'
      source_filter: "IsPublished = true"
      source_system_configuration_urn: 'urn:s2v:source_setting:webshop_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'WebshopProductID'
      source_business_key: 'WEBSHOP_DB'