
Masterdata

A Masterdata object represents a consolidated business view of a core Data Vault entity, the Hub. Hubs store unique business keys, and Satellites capture descriptive attributes from individual source systems. Masterdata brings these together by combining data from one or more Satellites into a single, flattened structure that reflects the business state of an entity. Consumers no longer need to join multiple Satellites themselves, and they get a stable, business-readable structure suitable for direct use in reporting, analytics, and operational systems.

Key Characteristics:

  • Hub-Anchored: Always references exactly one Hub as its base entity, inheriting its hash key, load timestamp, record source, and business key columns.
  • Multi-Satellite Integration: Combines descriptive columns from one or more Satellites (describing entities) into a single flat table.
  • Rule-Based Column Derivation: Target columns are defined via rules that support direct mappings, transformations, multi-column combine operations (concatenate, coalesce), constants, and references to other derived columns.
  • Controlled Historization: The enable_history parameter determines whether the object exposes the as-is view (false) or a full historic view.
Warning: Currently, only the as-is view (enable_history: false) is supported. Full historic view support is planned for an upcoming release.

Role in Data Vault:

Masterdata objects sit at the consumption layer of the Data Vault. They serve as pre-joined, business-readable structures that abstract the complexity of the raw vault (Hubs + Satellites) from downstream consumers such as BI tools, analytics platforms, and operational systems. By centralizing the join and transformation logic in the model definition, Masterdata objects ensure consistent, reusable output without duplicating transformation code across reports.
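
To make the "pre-joined" idea concrete, here is a minimal Python sketch of flattening one Hub and two Satellites into a single record per business key. It is an illustration only, not how the tool generates its output: the column name HK_CUSTOMER, the sample rows, and the flatten helper are all hypothetical, and the left-join behaviour is an assumption.

```python
# Hypothetical in-memory stand-ins for a Hub and two Satellites (current rows only).
hub_customer = [
    {"HK_CUSTOMER": "a1", "CUSTOMER_ID": "C-001"},
    {"HK_CUSTOMER": "b2", "CUSTOMER_ID": "C-002"},
]
# Satellite rows keyed by the Hub hash key (assumed column name: HK_CUSTOMER).
sat_details = {"a1": {"FULL_NAME": "Ada Lovelace", "SEGMENT_CODE": "B2C"}}
sat_address = {"a1": {"CITY_NAME": "London"}, "b2": {"CITY_NAME": "Berlin"}}


def flatten(hub_rows, satellites):
    """Left-join each Hub row to its Satellite rows by hash key (assumed semantics)."""
    flat = []
    for hub_row in hub_rows:
        record = dict(hub_row)  # start from the Hub's own columns
        for sat in satellites:
            # A Hub row without a matching Satellite row contributes no extra columns.
            record.update(sat.get(hub_row["HK_CUSTOMER"], {}))
        flat.append(record)
    return flat


flat_rows = flatten(hub_customer, [sat_details, sat_address])
```

Each entry in flat_rows is one flat record per Hub key, which is the shape a downstream report would otherwise have to assemble with its own joins.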

Simple Masterdata Example:

This example defines a Masterdata object for CUSTOMER, combining data from two Satellites into a single flat table anchored to HUB_CUSTOMER.

name: 'CUSTOMER'
entity_type: 'masterdata'
enable_history: false
description: 'Consolidated customer master data combining core details and address information'
enable_refresh: true

target_entity_columns:
  - CUSTOMER_NAME:
      data_type: 'STRING'
      default: 'N/A'
      description: 'Standardized full name of the customer'
  - FULL_ADDRESS:
      data_type: 'STRING'
      default: 'N/A'
      description: 'Complete mailing address concatenated from street, zip, and city'
  - COUNTRY_CODE:
      data_type: 'STRING'
      default: 'N/A'
      description: 'ISO 2-letter country code'
  - CUSTOMER_SEGMENT:
      data_type: 'STRING'
      default: 'UNKNOWN'
      description: 'Customer classification segment'

base_entity: 'HUB_CUSTOMER'

describing_entities:
  - details: 'SAT_CUSTOMER_DETAILS'
  - addr: 'SAT_CUSTOMER_ADDRESS'

match_entities: true

rules:
  - CUSTOMER_NAME:
      - details:
          input_columns:
            - FULL_NAME:
                - trim
                - capitalization: 'UPPER'
      - addr:
          input_columns:
            - CONTACT_NAME:
                - trim
                - capitalization: 'UPPER'

  - FULL_ADDRESS:
      - addr:
          input_columns:
            - STREET:
                - trim
            - ZIP_CODE:
                - trim
            - CITY_NAME:
                - trim
          combine_type: (concatenate, ', ')

  - COUNTRY_CODE:
      - addr:
          input_columns:
            - 'COUNTRY':
                - left: 2
                - capitalization: 'UPPER'

  - CUSTOMER_SEGMENT:
      - details:
          input_columns:
            - 'SEGMENT_CODE'
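
As a rough illustration of what the rules above compute, the following Python sketch applies the same transformations to one hypothetical address row. It does not show the tool's implementation. It assumes that transformations run in the order they are listed and that a column's default is used when no source value is available; the helper names and sample values are invented for this example.

```python
def trim(value):
    """Mirror of the 'trim' step: strip surrounding whitespace."""
    return value.strip()


def upper(value):
    """Mirror of "capitalization: 'UPPER'"."""
    return value.upper()


def left(n):
    """Mirror of 'left: n': keep the first n characters."""
    return lambda value: value[:n]


def apply_steps(value, steps):
    """Apply transformations in listed order (assumed ordering)."""
    for step in steps:
        value = step(value)
    return value


def with_default(value, default):
    """Fall back to the column default when the value is missing (assumption)."""
    return default if value is None else value


# Hypothetical row from SAT_CUSTOMER_ADDRESS.
addr = {"STREET": " 12 Main St ", "ZIP_CODE": "10115 ",
        "CITY_NAME": " Berlin", "COUNTRY": "de-DE"}

# FULL_ADDRESS: trim each input column, then concatenate with ', '.
full_address = ", ".join(
    apply_steps(addr[col], [trim]) for col in ("STREET", "ZIP_CODE", "CITY_NAME")
)

# COUNTRY_CODE: first two characters, then upper-case.
country_code = apply_steps(addr["COUNTRY"], [left(2), upper])

# CUSTOMER_SEGMENT: no source value in this sample, so the default applies.
customer_segment = with_default(None, "UNKNOWN")
```

Note that left: 2 only yields a valid ISO code when the source value already starts with one (as in "de-DE"); a value such as "Germany" would produce "GE".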
  • For a detailed step-by-step guide on building a Masterdata object, please refer to the How to build a Masterdata object? tutorial.
  • For a comprehensive guide on all available properties and detailed explanations for defining a Masterdata object, please refer to the Masterdata Detailed Guide.