Initialize Your First Project
This guide will help you set up and run your first Stream2Vault (S2V) project.
Before you start, please ensure:
- You have the Stream2Vault client installed (see Installation).
- You have met all Prerequisites.
- You have logged in by running:
  s2v login -c <AUTHENTICATION.JSON>
Step 1: Get Your Project Files
The easiest way to start is by using a pre-configured sample project. Alternatively, you can create the minimal files manually.
Option A: Use Sample Data (Recommended for Beginners)
- Download Sample Project:
  - Download the Default Project from the Sample Data page.
- Extract the Project:
  - Extract the contents of the downloaded ZIP file into a new folder on your computer. Let's call this folder my_first_s2v_project/.
  - Inside my_first_s2v_project/, you should find a subfolder, often named dv_model/ or similar, containing all the necessary YAML files and configuration. For this guide, we'll assume it's named dv_model/.
Option B: Create a Minimal Project Manually
If you prefer to start from scratch:
- Create Project Folders:
  - Create a main project folder, for example, my_first_s2v_project/.
  - Inside my_first_s2v_project/, create a subfolder named dv_model/. This will be your input model directory.
- Create Essential Configuration Files inside dv_model/:
  - data_vault_settings.yaml: Defines global settings for your Data Vault generation. See Data Vault Settings.
  - source_system_settings.yaml: Defines settings for your source systems. See Source System Settings.
  - information_schema.csv: Contains metadata about your source tables (columns, data types). S2V uses this to validate your model definitions against actual source structures. If you are not using sample data, you might need to manually create or obtain the information schema. See Information Schema.
- Create a Simple Data Vault Object File:
  - You can create your own Hub definition file (e.g., hub_customer.yaml) in the dv_model/ directory using the template below.
  - Alternatively, follow the Build a Hub Tutorial for a guided example.
  - Ensure the source table and columns you reference here exist in your information_schema.csv.
# dv_model/hub.yaml
name: '<HUB_NAME>'
entity_type: 'hub'
enable_refresh: true
concatenate_business_keys: false
requires_bussines_key: false
target_business_key_columns:
  - '<BUSINESS_KEY_NAME>'
entity_sources:
  - urn:s2v:hub_source:src_customers:
      entity_source: '(DATABASE_NAME, SCHEMA_NAME, TABLE_NAME)'
      source_system_configuration_urn: 'urn:s2v:source_setting:<SOURCE_SYSTEM_NAME>'
      business_key_mapping:
        - <BUSINESS_KEY_NAME>:
            - '<SOURCE_COLUMN_NAME>'
      source_business_key: ''
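The folder and file layout described in Option B can be sketched as a short shell script. This is only a minimal sketch: it creates the folders and empty placeholder files named in this guide, which you would then fill in by hand. The variable names (PROJECT_DIR, MODEL_DIR, CONFIG_DIR) are assumptions made for illustration and are not part of S2V.

```shell
#!/bin/sh
# Sketch: scaffold the minimal project layout from Option B.
# PROJECT_DIR, MODEL_DIR and CONFIG_DIR are illustrative names, not S2V settings.
PROJECT_DIR="my_first_s2v_project"
MODEL_DIR="$PROJECT_DIR/dv_model"

# Option B places the configuration files inside dv_model/.
# Set CONFIG_DIR="$PROJECT_DIR" instead if you follow the root layout shown in Step 2.
CONFIG_DIR="$MODEL_DIR"

# Create the project folder and the model subfolder in one call.
mkdir -p "$MODEL_DIR"

# Create empty placeholders for the configuration files, leaving existing files untouched.
for f in data_vault_settings.yaml source_system_settings.yaml information_schema.csv; do
  [ -e "$CONFIG_DIR/$f" ] || : > "$CONFIG_DIR/$f"
done

# Create an empty placeholder for the example Hub definition.
[ -e "$MODEL_DIR/hub_customer.yaml" ] || : > "$MODEL_DIR/hub_customer.yaml"
```

The script is safe to re-run because it never overwrites a file that already exists. The placeholders are empty, so they still need real content (for example, the Hub template above) before the project is usable.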
Step 2: Understand the Default Project Structure
By default, S2V expects all your configuration files (from Step 1) to be in the root directory of your project (e.g., my_first_s2v_project/). All other YAML files, such as Data Vault object YAMLs, can be stored anywhere in the project directory. For more information, refer to the Project Structure Tutorial.
Example Project Structure
my_first_s2v_project/ # This is your input project directory (-i parameter)
├── data_vault_settings.yaml
├── source_system_settings.yaml
├── information_schema.csv
└── dv_model/
    ├── hub_customer.yaml # Example DV object file
    └── ... # Other DV object YAML files (links, satellites, etc.)
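Before running S2V, it can help to confirm that a project folder matches the default layout shown above. The following is a minimal sketch, not an S2V feature: check_project is a hypothetical helper name, and it only checks that the three configuration files exist in the project root. It does not validate their contents.

```shell
#!/bin/sh
# Sketch: check that the three configuration files sit in the project root.
# check_project is a hypothetical helper, not part of the S2V client.
check_project() {
  project_dir="$1"
  missing=0
  for f in data_vault_settings.yaml source_system_settings.yaml information_schema.csv; do
    if [ ! -f "$project_dir/$f" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: $project_dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 means all three files were found, 1 means at least one is missing.
  return "$missing"
}
```

Call it as check_project my_first_s2v_project and inspect the exit status. Data Vault object YAMLs are deliberately not checked, since they may live anywhere in the project directory.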