Version: 1.3.0

How to build an Aggregated Hub Satellite?

This tutorial will guide you through building a complete Aggregated Hub Satellite object YAML definition step-by-step. An Aggregated Hub Satellite holds descriptive attributes for a parent Hub and tracks their history.

We'll build an Aggregated Hub Satellite named SAT_SAP_CEPC, connected to HUB_PROFIT_CENTER. The definition closely follows the regular Satellite example, so this tutorial highlights the key differences.

  1. Aggregated Satellite Name (name): This is the unique identifier for your Aggregated Hub Satellite object.

    name: 'SAT_SAP_CEPC' # 1. Aggregated Satellite Name
  2. Entity Type (entity_type): Specifies that this object is an Aggregated Hub Satellite.

    name: 'SAT_SAP_CEPC'
    entity_type: 'hubsat_agg' # 2. Entity Type - the key difference from a regular Hub Satellite
  3. Connected Entity (connected_entity): The name of the parent Hub to which this Satellite is attached. This Hub must be defined in your data vault model.

    name: 'SAT_SAP_CEPC'
    entity_type: 'hubsat_agg'
    connected_entity: 'HUB_PROFIT_CENTER' # 3. Connected Entity (must be a Hub)
  4. Enable Refresh (enable_refresh): A boolean flag indicating whether the Satellite receives new data. Set to true if the object can receive new data (typical for Satellites), or false if it only contains historical records and will not be updated.

    name: 'SAT_SAP_CEPC'
    entity_type: 'hubsat_agg'
    connected_entity: 'HUB_PROFIT_CENTER'
    enable_refresh: true # 4. Enable Refresh
  5. Skip Hashdiff Comparison (skip_hashdiff_comparison): A boolean flag. If true, S2V will skip comparing the hashdiff of incoming records with existing records in the Satellite. This means every incoming record for a given business key will be treated as a new version, regardless of whether its descriptive attributes have changed.

    name: 'SAT_SAP_CEPC'
    entity_type: 'hubsat_agg'
    connected_entity: 'HUB_PROFIT_CENTER'
    enable_refresh: true
    skip_hashdiff_comparison: false # 5. Skip Hashdiff Comparison
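To make the effect of this flag concrete, here is a minimal Python sketch of hashdiff-based change detection. It is a conceptual illustration only, not S2V's implementation: the helper names `hashdiff` and `is_new_version`, the SHA-256 hash, the `|` delimiter, and the sample attribute values are all assumptions made for this example.

```python
import hashlib

def hashdiff(record, columns):
    """Hash the descriptive attributes of a record (hypothetical helper)."""
    payload = "|".join(str(record.get(col, "")) for col in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def is_new_version(incoming, latest, columns, skip_hashdiff_comparison=False):
    """Decide whether an incoming record becomes a new Satellite version."""
    if latest is None or skip_hashdiff_comparison:
        # No prior version exists, or the comparison is disabled:
        # the incoming record is always treated as a new version.
        return True
    # Otherwise insert only if the descriptive attributes changed.
    return hashdiff(incoming, columns) != hashdiff(latest, columns)

columns = ["ABTEI", "VERAK"]  # descriptive attributes (sample values are made up)
latest = {"ABTEI": "FIN", "VERAK": "MILLER"}
unchanged = {"ABTEI": "FIN", "VERAK": "MILLER"}
changed = {"ABTEI": "FIN", "VERAK": "SMITH"}

print(is_new_version(unchanged, latest, columns))  # False
print(is_new_version(changed, latest, columns))    # True
print(is_new_version(unchanged, latest, columns, skip_hashdiff_comparison=True))  # True
```

With the flag set to true, even the unchanged record produces a new version, which matches the behavior described above.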
  6. Ordering Columns (ordering_columns): A list of column names from the source that determine the sequence of records when multiple changes occur for the same business key at the same load datetime. This helps identify the latest change correctly, for example when the entire batch of records shares the same load datetime. Can be an empty list [] if not needed.

    name: 'SAT_SAP_CEPC'
    entity_type: 'hubsat_agg'
    connected_entity: 'HUB_PROFIT_CENTER'
    enable_refresh: true
    skip_hashdiff_comparison: false
    ordering_columns: [] # 6. Ordering Columns (empty list in this case)
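The role of ordering columns as tie-breakers can be sketched in Python as follows. This is an illustrative assumption about the general technique, not S2V's actual logic: the function `latest_record`, the `load_datetime` field, and the `CHANGE_SEQ` column are hypothetical names introduced for the example.

```python
def latest_record(records, ordering_columns):
    """Return the most recent record, using ordering columns as tie-breakers."""
    def sort_key(rec):
        # Sort by load datetime first, then by each ordering column in turn.
        return (rec["load_datetime"], *[rec[col] for col in ordering_columns])
    return max(records, key=sort_key)

# Two changes for the same business key that share one load datetime.
batch = [
    {"PRCTR": "P100", "load_datetime": "2024-01-01T00:00:00", "CHANGE_SEQ": 1, "VERAK": "MILLER"},
    {"PRCTR": "P100", "load_datetime": "2024-01-01T00:00:00", "CHANGE_SEQ": 2, "VERAK": "SMITH"},
]

print(latest_record(batch, ["CHANGE_SEQ"])["VERAK"])  # SMITH
```

Without an ordering column, both records compare as equal and the result depends on the order in which they arrive, which is the ambiguity that ordering_columns is meant to remove.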
  7. Entity Source (entity_source): For an Aggregated Hub Satellite, this defines the single source that feeds data into this Satellite. It's a tuple-keyed map.

    • 7.1. Source Location Tuple (Key): The key of the entity_source map is a tuple (SOURCE_DATABASE, SOURCE_SCHEMA, SOURCE_TABLE) or (SOURCE_SCHEMA, SOURCE_TABLE) specifying the physical location of the source data.
    # ... (previous properties)
    entity_source: # 7. Entity Source
      (SAP_MASTERDATA, CEPC): # 7.1. Source Location Tuple
    • 7.2. Source Filter (source_filter): An optional SQL condition applied to the source data.
    # ... (previous properties)
    entity_source:
      (SAP_MASTERDATA, CEPC):
        source_filter: '' # 7.2. Nested Source Filter
    • 7.3. Source System Configuration URN (source_system_configuration_urn): Links this source to its global system settings.
    # ... (previous properties)
    entity_source:
      (SAP_MASTERDATA, CEPC):
        source_filter: ''
        source_system_configuration_urn: 'urn:s2v:source_setting:SAP' # 7.3. Source System Config URN
    • 7.4. Business Key Mapping (business_key_mapping): Defines how the business key(s) of the parent Hub (HUB_PROFIT_CENTER) are identified from the columns of this Satellite's source table. This mapping ensures the Satellite record is correctly linked to its parent Hub record.
      • The key (e.g., PROFIT_CENTER_BK) is the name of the business key column in the parent HUB_PROFIT_CENTER.
      • The value (e.g., PRCTR and KOKRS) is the list of corresponding column(s) in the Satellite's source table (CEPC).
    # ... (previous properties)
    entity_source:
      (SAP_MASTERDATA, CEPC):
        source_filter: ''
        source_system_configuration_urn: 'urn:s2v:source_setting:SAP'
        business_key_mapping: # 7.4. Business Key Mapping
          - PROFIT_CENTER_BK: # Business key column name in HUB_PROFIT_CENTER
              - 'PRCTR' # Source column from CEPC
              - 'KOKRS' # Source column from CEPC
    • 7.5. Source Business Key (source_business_key): If the parent Hub is a multi-master Hub, the Satellite needs to align with its specific source settings, though it inherits the value from the Hub.
    # ... (previous properties)
    entity_source:
      (SAP_MASTERDATA, CEPC):
        source_filter: ''
        source_system_configuration_urn: 'urn:s2v:source_setting:SAP'
        business_key_mapping:
          - PROFIT_CENTER_BK:
              - 'PRCTR'
              - 'KOKRS'
        source_business_key: '' # 7.5. Source Business Key
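The two accepted shapes of the source location key from 7.1 can be illustrated with a short Python sketch. It only demonstrates how a two- or three-element key maps to database, schema, and table; the function `parse_source_location` and the database name `SAP_DB` are hypothetical and not part of S2V.

```python
def parse_source_location(key):
    """Split a source location key into database, schema and table."""
    parts = [part.strip() for part in key.strip().strip("()").split(",")]
    if len(parts) == 3:
        database, schema, table = parts
    elif len(parts) == 2:
        database = None  # no database given: only schema and table
        schema, table = parts
    else:
        raise ValueError(f"expected 2 or 3 elements, got {len(parts)}: {key!r}")
    return {"database": database, "schema": schema, "table": table}

print(parse_source_location("(SAP_MASTERDATA, CEPC)"))
print(parse_source_location("(SAP_DB, SAP_MASTERDATA, CEPC)"))  # SAP_DB is a made-up name
```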
    Lookup Mapping for Satellites

    Typically, a Hub Satellite's source is also a source of its parent Hub, so the data directly contains the parent Hub's business key. In such cases, you use business_key_mapping to link them, as shown in the example above.

    However, sometimes the Satellite's source data might not have the parent Hub's business key directly. Instead, it might have a different identifier (like a foreign key or an alternative ID). If this identifier can be used to find the correct Hub business key by looking it up in one of the parent Hub's own source tables, then you should use lookup_mapping for the Satellite.

    With lookup_mapping, you tell S2V:

    1. Which of the parent Hub's sources to look into (using the Hub's source URN).
    2. Which column(s) from the Satellite's source to use for the lookup.
    3. Which column(s) in the Hub's source table to match against.

    This allows S2V to resolve the correct parent Hub key for the Satellite record. For a detailed guide on both mapping types, please see the "Business Key and Lookup Mappings" documentation.
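The three pieces of information listed above can be pictured as a simple lookup. The Python sketch below is a conceptual illustration under stated assumptions, not S2V's implementation or its YAML syntax: the function `resolve_hub_keys`, the columns `ALT_ID` and `PROFIT_CENTER_ALT_ID`, and all sample rows are hypothetical.

```python
def resolve_hub_keys(satellite_rows, hub_source_rows,
                     satellite_column, hub_match_column, hub_key_columns):
    """Attach the parent Hub's business key to each Satellite source row."""
    # Index the Hub's source table by the column used for matching.
    lookup = {
        row[hub_match_column]: tuple(row[col] for col in hub_key_columns)
        for row in hub_source_rows
    }
    # Resolve each Satellite row; unmatched rows get None.
    return [
        {**row, "hub_business_key": lookup.get(row[satellite_column])}
        for row in satellite_rows
    ]

# 1. The Hub source to look into (sample rows, made up for the example).
hub_source = [
    {"ALT_ID": "A-1", "PRCTR": "P100", "KOKRS": "1000"},
    {"ALT_ID": "A-2", "PRCTR": "P200", "KOKRS": "1000"},
]
# 2. The Satellite source, which only carries the alternative identifier.
satellite_source = [
    {"PROFIT_CENTER_ALT_ID": "A-2", "VERAK": "SMITH"},
    {"PROFIT_CENTER_ALT_ID": "A-9", "VERAK": "JONES"},
]

resolved = resolve_hub_keys(
    satellite_source, hub_source,
    satellite_column="PROFIT_CENTER_ALT_ID",  # column from the Satellite's source
    hub_match_column="ALT_ID",                # 3. column in the Hub's source to match
    hub_key_columns=["PRCTR", "KOKRS"],
)
print(resolved[0]["hub_business_key"])  # ('P200', '1000')
print(resolved[1]["hub_business_key"])  # None
```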

  8. Historized Columns (historized_columns): A list of column names from the source that contain descriptive attributes whose changes over time should be tracked in the Satellite. Each change to these attributes for a given business key will result in a new record in the Satellite.

    # ... (previous properties)
    historized_columns: # 8. Historized Columns
    - 'ABTEI'
    - 'VERAK'
    - 'USNAM'
    - 'ERSDA'
    - 'WAERS'
  9. Non-Historized Columns (non_historized_columns): A list of column names from the source that contain descriptive attributes whose current value should be stored, but whose history is not tracked.

    # ... (previous properties)
    non_historized_columns: # 9. Non-Historized Columns
    - "PRCTR"
    - "KOKRS"

Complete Aggregated Hub Satellite YAML

Here is the complete YAML definition for our SAT_SAP_CEPC Aggregated Hub Satellite:

name: 'SAT_SAP_CEPC'
entity_type: 'hubsat_agg' # The key difference in the resulting code compared to a regular hubsat entity
enable_refresh: true
skip_hashdiff_comparison: false
ordering_columns: [] # Optional: for sequencing records with the same load timestamp

connected_entity: 'HUB_PROFIT_CENTER' # Name of the parent Hub
entity_source:
  (SAP_MASTERDATA, CEPC): # Source: (Schema, Table)
    source_filter: '' # Optional filter
    source_system_configuration_urn: 'urn:s2v:source_setting:SAP'
    source_business_key: '' # Optional source system identifier
    business_key_mapping: # How to get the parent Hub's business key
      - PROFIT_CENTER_BK: # Business key column name in HUB_PROFIT_CENTER
          - "PRCTR" # Corresponding column in CEPC table
          - "KOKRS" # Corresponding column in CEPC table

historized_columns: # Attributes whose history is tracked
- 'ABTEI'
- 'VERAK'
- 'USNAM'
- 'ERSDA'
- 'WAERS'

non_historized_columns: # Attributes whose history is NOT tracked (current value stored)
- "PRCTR"
- "KOKRS"