
Frequently Asked Questions

Security

1. Who owns the GCP tenant for Stream2Vault and who is responsible for its management?

The GCP tenant hosting the Stream2Vault (S2V) application is owned and managed by reeeliance IM GmbH. The S2V Product Team within reeeliance is responsible for its management, including provisioning, monitoring, and securing resources. Access control follows the principle of least privilege, ensuring only authorized personnel can perform administrative actions.

2. How is the Stream2Vault application hosted in GCP?

Stream2Vault is hosted on Google Cloud Platform (GCP) using Google Cloud Run, a fully managed, serverless container execution platform. It runs as a containerized service with access controlled via Google IAM. The application image is stored in Google Artifact Registry, and the service executes under dedicated service accounts. All deployments go through CI/CD pipelines to keep releases secure and consistent.

3. How does communication happen between the client environment and the Stream2Vault hosting environment?

Communication between the client environment and the Stream2Vault service happens via the S2V client, which runs on users' machines. The client connects to the Stream2Vault server, which is deployed on Google Cloud Run. All communication is secured using HTTPS and authenticated via OAuth 2.0 using Azure AD: users sign in with their Azure AD credentials, and API access is authorized using OAuth tokens. IAM policies further enforce access control. No direct network connectivity between the client environment and the hosting environment is required. Users initiate connections via the client, which interacts with the service securely over the internet.
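As an illustration of the flow described above, the following minimal sketch builds (but does not send) an HTTPS request that carries an OAuth 2.0 bearer token. It is not the S2V client's actual code: the endpoint URL, the function name, and the payload are hypothetical, and obtaining the token from Azure AD is left out.

```python
import json
import urllib.request

# Hypothetical endpoint; the real S2V service URL is not given in this FAQ.
S2V_VALIDATE_URL = "https://s2v.example.invalid/api/validate"


def build_validation_request(access_token, config, url=S2V_VALIDATE_URL):
    """Build (but do not send) an HTTPS POST carrying an OAuth 2.0 bearer token."""
    if not url.startswith("https://"):
        raise ValueError("the service must be called over HTTPS")
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The token would be issued by Azure AD after the user signs in.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder token and payload, for illustration only.
request = build_validation_request("<token-from-azure-ad>", {"hub": "customer"})
print(request.get_method(), request.full_url)
```

The connection is always opened from the client side, which is why no inbound network path into the client environment is needed.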

4. Where and how are logs stored? Who has access to these logs?

Logs are collected via Google Cloud Logging. These include:

  • Application logs (API requests, processing events)
  • Security logs (authentication, access control events)
  • Audit logs (changes to infrastructure, configurations)

Authentication events should also be recorded in the client's Azure AD Sign-in Logs and Audit Logs. Access to logs is restricted to authorized administrators and security teams in line with Google IAM and Azure AD policies. Stream2Vault logs are retained indefinitely.

5. Are there any regulatory compliance considerations for Stream2Vault?

Stream2Vault does not collect or store any PII, business data, or customer information.

  • No personally identifiable information (PII) is collected by the service or the client.
  • No business data is collected or stored. Generated SQL code and configuration files sent to the service for validation are processed entirely in memory and are not retained.
  • No database credentials are exposed to the S2V service. The S2V service has no access to any client infrastructure.

In terms of regulatory compliance, Stream2Vault aligns with SOC 2, ISO 27001, and GDPR best practices by:

  • Enforcing federated authentication via Azure AD
  • Restricting API access using OAuth 2.0 tokens
  • Logging authentication and access attempts for auditability
  • Preventing unauthorized access using IAM role-based policies

Modeling

How should multi-active satellites be modeled?

Modeling multi-active satellites requires careful consideration of the specific use case. Here are a few approaches:

  1. Verify Necessity: First, confirm whether a multi-active satellite is genuinely required. Often, what appears to be a multi-active scenario can be handled by a standard Satellite if the driving key correctly represents the grain.
  2. Revisit Model Design: Review your overall Data Vault model. Sometimes, adjusting existing Hubs or Links, or introducing new ones, removes the need for a multi-active satellite.
  3. Introduce Technical Hubs: In some cases, a technical Hub that represents the unique combination of keys defining the multi-active relationship can be a solution. This new Hub then becomes the parent of the Satellite.

The best approach depends on the specific scenario. We recommend consulting Data Vault best practices or seeking expert advice for complex situations.
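To make the "verify necessity" check concrete, here is a small sketch that tests whether a candidate key matches the grain of the source data. It is a generic illustration, not part of Stream2Vault; the function name, column names, and sample rows are hypothetical.

```python
from collections import Counter


def find_grain_violations(rows, key_columns):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical staging rows: one customer with two active phone numbers.
rows = [
    {"customer_id": "C1", "phone_type": "home", "phone": "111"},
    {"customer_id": "C1", "phone_type": "work", "phone": "222"},
    {"customer_id": "C2", "phone_type": "home", "phone": "333"},
]

# Keyed on customer_id alone, the data looks multi-active.
print(find_grain_violations(rows, ["customer_id"]))  # [('C1',)]

# Keyed on the full grain, every row is unique.
print(find_grain_violations(rows, ["customer_id", "phone_type"]))  # []
```

If the wider key removes all violations, that combination of keys is a candidate for a standard Satellite on a technical Hub rather than a multi-active satellite.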

How can a Link be modeled if it uses the same source table multiple times for different relationships?

When a Link needs to connect to the same source table to represent different roles or relationships (e.g., an Employee table for both an employee and their manager), you should not list the same source multiple times under entity_sources in the Link's YAML. Instead, treat each role as a distinct connection:

  1. In the Link's connected_hubs section, reference the same Hub multiple times, giving each instance a different alias (for example, employee and manager).
  2. In the entity_sources section, map the source columns to the respective aliased Hub's business keys.
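The two steps above can be sketched conceptually in Python. Only the section names connected_hubs and entity_sources come from this FAQ; every other field name, the helper function, and the sample data are assumptions for illustration and do not reflect Stream2Vault's actual YAML schema.

```python
# Hypothetical in-memory description of a Link. Only connected_hubs and
# entity_sources are names from the FAQ; all other fields are assumptions.
link_definition = {
    "connected_hubs": [
        {"hub": "employee", "alias": "employee"},
        {"hub": "employee", "alias": "manager"},
    ],
    "entity_sources": [
        {
            "source": "employee_table",
            # Maps each Hub alias to the source column holding its business key.
            "business_key_columns": {
                "employee": "employee_id",
                "manager": "manager_id",
            },
        }
    ],
}


def build_link_rows(definition, source_rows):
    """Project each source row onto one business key per Hub alias."""
    aliases = [hub["alias"] for hub in definition["connected_hubs"]]
    if len(aliases) != len(set(aliases)):
        raise ValueError("each Hub reference needs a distinct alias")
    # The source is listed once; the aliases carry the different roles.
    (source,) = definition["entity_sources"]
    columns = source["business_key_columns"]
    return [{alias: row[columns[alias]] for alias in aliases} for row in source_rows]


employees = [
    {"employee_id": "E2", "manager_id": "E1", "name": "Ada"},
    {"employee_id": "E3", "manager_id": "E1", "name": "Lin"},
]
link_rows = build_link_rows(link_definition, employees)
print(link_rows)
```

Each resulting row holds two business keys that point at the same Hub under different roles, while the source table is read only once.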

If this aliasing approach doesn't fit, it might indicate a need to revisit the Link's design or the surrounding Hub structures.