Hub

In Data Vault 2.0, a Hub represents a core business concept or entity. It contains a distinct list of unique business keys from across the enterprise, acting as an integration point. Hubs are designed to be stable and resilient to changes in source systems.

The primary purpose of a Hub is to store these business keys, along with metadata such as load timestamps and record sources, but not the descriptive attributes of the business entity itself (those are stored in Satellites).

Below are the properties you can use to define a Hub.

Hub Properties Overview

| Property | Type | Description |
|---|---|---|
| name | String | The unique name for the Hub object. This name is used for the generated table and in references by other objects (e.g., Links, Satellites). Example: 'HUB_MATERIAL' |
| entity_type | String | Must be set to 'hub' to define this object as a Hub. |
| concatenate_business_keys | Boolean | If true and multiple source columns are mapped to a single business key, their values are concatenated before hashing to form the Hub's hash key. |
| requires_source_business_key | Boolean | If true, each source definition within entity_sources must provide a non-empty source_business_key value. |
| enable_refresh | Boolean | If true, the Hub receives new data. Set to false if the source contains static (historical) data. See Shared Properties. |
| target_business_key_columns | List of Strings | A list defining the names of the column(s) that store the business key(s) in the generated Hub table. Example: ['MATERIAL_ID'] |
| entity_sources | List of Maps | Defines the list of sources that feed data into this Hub. Each item in the list represents a distinct source, following the structure outlined in Entity Sources Definition. |

Hub-Specific Property Details

concatenate_business_keys

  • Type: Boolean
  • Description: This property determines how the Hub's hash key is generated when multiple source columns are mapped to the Hub's business keys.
    • If true: The values from the multiple source columns (as defined in business_key_mapping for a given entity_source) are concatenated together (using a delimiter defined in data_vault_settings.yaml) before the hash function is applied. This is typically used when a combination of several source attributes forms a single conceptual business key. The target_business_key_columns must list a single column name for this concatenated key.
    • If false: Each business key defined in target_business_key_columns is treated independently. If multiple target business keys are defined and mapped from source columns, each will result in a separate hash key calculation if not handled by other logic. This is common for Hubs with a single, clearly defined business key from the source or when business keys are sourced independently and are not part of a composite key from a single source instance.
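As a minimal sketch of the true case, the fragment below maps two source columns to one concatenated target column. The hub name, source table, column names, and URNs are hypothetical, and both the nesting and the practice of listing several source columns under one target in business_key_mapping are assumptions inferred from the examples on this page, not a confirmed schema.

```yaml
# Hypothetical sketch: all names and URNs below are illustrative only.
entity_type: 'hub'
name: 'HUB_ORDER_LINE'

concatenate_business_keys: true     # concatenate mapped source columns before hashing
requires_source_business_key: false
enable_refresh: true

target_business_key_columns:
  - 'ORDER_LINE_KEY'                # single target column for the concatenated key

entity_sources:
  - urn:s2v:hub_source:sales_order_lines:
      entity_source: '(SALES_DB, dbo, OrderLines)'
      source_filter: ''
      source_system_configuration_urn: 'urn:s2v:source_setting:sales_config'
      business_key_mapping:
        - ORDER_LINE_KEY:
            - 'OrderNumber'         # assumed: several source columns under one target
            - 'LineNumber'
      source_business_key: ''       # empty because requires_source_business_key is false
```

The delimiter placed between OrderNumber and LineNumber is not set here; per the description above, it comes from data_vault_settings.yaml.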

requires_source_business_key

  • Type: Boolean
  • Description: This property enforces whether the source_business_key field must be populated within each definition in the entity_sources list.
    • If true: Each source entry in entity_sources must provide a non-empty string value for its source_business_key field. This is crucial for multi-master Hubs where the same business key value (e.g., customer ID '123') might originate from different source systems (e.g., 'CRM' and 'Billing'). The source_business_key helps distinguish these instances, ensuring the uniqueness of the Hub's hash key when combined with the actual business key.
    • If false: The source_business_key field within each entity_source definition must be empty or omitted. This is suitable when the business keys are globally unique or when the Hub is not designed to handle multi-master scenarios where the same key value can appear in different source systems with different meanings or contexts.

target_business_key_columns

  • Type: List of Strings
  • Description: This property defines the names of the column(s) that will store the business key(s) in the generated Hub table.
    • If concatenate_business_keys is true and multiple source columns contribute to a single conceptual key, this list must contain the single target column name for that concatenated key.
    • If concatenate_business_keys is false and you list multiple column names here, it implies the Hub represents a composite business key where each part is stored separately. Each of these columns would then need to be mapped from the source(s) in the business_key_mapping.
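The second case can be sketched as follows: two target columns, each mapped separately from the source. All names and URNs are hypothetical, and the nesting is an assumption based on the examples on this page.

```yaml
# Hypothetical sketch: all names and URNs below are illustrative only.
entity_type: 'hub'
name: 'HUB_PLANT_MATERIAL'

concatenate_business_keys: false    # each key part is stored in its own column
requires_source_business_key: false
enable_refresh: true

target_business_key_columns:
  - 'PLANT_CODE'
  - 'MATERIAL_NUMBER'

entity_sources:
  - urn:s2v:hub_source:erp_plant_materials:
      entity_source: '(ERP_DATA, dbo, PlantMaterials)'
      source_filter: ''
      source_system_configuration_urn: 'urn:s2v:source_setting:erp_system_config'
      business_key_mapping:
        - PLANT_CODE:               # every target column needs its own mapping
            - 'Plant'
        - MATERIAL_NUMBER:
            - 'MaterialNo'
      source_business_key: ''
```

Every entry in target_business_key_columns has a matching entry in business_key_mapping, which is the requirement stated above.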

Simple Example

This example defines a simple HUB_PRODUCT with a single business key PRODUCT_SKU sourced from one table.

entity_type: 'hub'
name: 'HUB_PRODUCT'

concatenate_business_keys: false
requires_source_business_key: false # Not a multi-master hub
enable_refresh: true

target_business_key_columns:
  - 'PRODUCT_SKU'

entity_sources:
  - urn:s2v:hub_source:inventory_products:
      entity_source: '(INVENTORY_DB, dbo, Products)'
      source_filter: "IsActive = 1"
      source_system_configuration_urn: 'urn:s2v:source_setting:inventory_config'
      business_key_mapping:
        - PRODUCT_SKU:
            - 'SKU'
      source_business_key: '' # Empty as requires_source_business_key is false

Comprehensive Hub Example

This example defines HUB_MATERIAL which integrates material identifiers from three different source contexts or systems, all mapping to a single target business key MATERIAL_ID.

entity_type: 'hub'
name: 'HUB_MATERIAL'

concatenate_business_keys: false
requires_source_business_key: true # Assuming MATERIAL_ID is NOT globally unique across these sources
enable_refresh: true

target_business_key_columns:
  - 'MATERIAL_ID'

entity_sources:
  - urn:s2v:hub_source:src_erp_materials:
      entity_source: '(ERP_DATA, dbo, MaterialMaster)'
      source_filter: "TYPE = 'RAW'"
      source_system_configuration_urn: 'urn:s2v:source_setting:erp_system_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'MaterialInternalID'
      source_business_key: 'ERP_DATA'

  - urn:s2v:hub_source:src_legacy_parts:
      entity_source: '(LEGACY_SYSTEM, parts_data, PartTable)'
      source_filter: '' # No filter for this source
      source_system_configuration_urn: 'urn:s2v:source_setting:legacy_system_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'PartGlobalIdentifier'
      source_business_key: 'LEGACY_SYSTEM'

  - urn:s2v:hub_source:src_webshop_catalog:
      entity_source: '(WEBSHOP_DB, catalog, ProductItems)'
      source_filter: "IsPublished = true"
      source_system_configuration_urn: 'urn:s2v:source_setting:webshop_config'
      business_key_mapping:
        - MATERIAL_ID:
            - 'WebshopProductID'
      source_business_key: 'WEBSHOP_DB'