Skip to content

ELT Layer Architecture - Best Practices & Recommendations

Date: 2026-01-05 Status: Recommendation Purpose: Define optimal ELT layer architecture based on Snowflake + dbt best practices


Analysis: Required Steps & Optimal Placement

Your Required Pipeline Steps:

  1. ✅ Copy to Snowflake
  2. ✅ Merge into historical table with record metadata
  3. ✅ Add XO employee metadata
  4. ✅ Add client-specific metadata (glossaries)
  5. ✅ Create SOT (Source of Truth) KPI table
  6. ✅ Create multiple views for consumption (analysts + Tableau)

❌ Problems with Current Approach

Issue 1: Redundant Table Prefixes

Current:

SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS   Redundant prefix
GOLD.DM_SCORECARD_GPP                          Unnecessary DM_ prefix

Better:

SILVER.CONTACT_TIMESTAMPS                      Schema name provides context
GOLD.AGENT_KPIS_DAILY                          Descriptive, no prefix needed

Principle: Schema name already indicates layer - don't repeat it in table name.


Issue 2: Mixing Concerns in Silver

Current plan (from earlier discussion): - Silver does: Historical preservation + Employee enrichment + Glossary enrichment + Filtering

Problem: This mixes multiple concerns in one layer - Historical preservation (append records) - Business enrichment (JOINs with rosters/glossaries) - Filtering logic (XO agents only)

Result: - Complex Silver models - Expensive joins on every incremental run - Hard to maintain


Issue 3: Unclear "Source of Truth"

What is the SOT for analysts? - Silver historical table? (has all records but no enrichment) - Gold KPI table? (aggregated, lost granularity) - Something in between?

Best practice: SOT should be the most granular enriched table - all records with all metadata.


Layer Philosophy

Bronze: Immutable raw data (source of record) Silver: Historical preservation (source of history) Gold: Enriched + aggregated data (source of truth for business)

Key Principle: Each layer has a single clear purpose.


Optimal Layer Design

LAYER 1: BRONZE (Raw Landing)

Purpose: Immutable raw source data exactly as received

Who creates it: xo-foundry (Airflow)

Characteristics: - ✅ Truncated daily (idempotent loads) - ✅ All VARCHAR (no type conversion) - ✅ Metadata added: RECORD_KEY, RECORD_HASH, DATE_TO_WAREHOUSE - ✅ NO transformations - ✅ NO filtering - ✅ Temporary landing zone (doesn't preserve history)

Naming convention: {SOURCE}_{OBJECT}

Example:

WBP_DB.BRONZE.GLADLY_CONTACT_TIMESTAMPS
WBP_DB.BRONZE.GLADLY_CONVERSATION_TIMESTAMPS
WBP_DB.BRONZE.SPROUT_MESSAGES

✅ This is already correct - no changes needed.


LAYER 2: SILVER (Historical Preservation)

Purpose: Immutable historical record of all source data

Who creates it: dbt

What it does: - ✅ Preserve full history (append-only or SCD Type 2) - ✅ Type conversion (VARCHAR → proper types) - ✅ Column standardization (uppercase, underscores) - ✅ Deduplication (RECORD_KEY) - ⚠️ NO enrichment (no JOINs with other tables) - ⚠️ NO filtering (preserve all records) - ⚠️ NO aggregation (keep grain identical to Bronze)

Why no enrichment? - Performance: Joins are expensive on large fact tables - Maintainability: If roster/glossary structure changes, don't reprocess history - Separation of concerns: Silver = preservation, Gold = enrichment

Naming convention: {OBJECT}_HISTORY or just {OBJECT}

Materialization: incremental (append strategy)

Example:

-- Simple, fast, minimal transformation
WBP_DB.SILVER.CONTACT_TIMESTAMPS
  - All Bronze columns (properly typed)
  - Preserved metadata columns
  - DBT audit columns
  - NO employee metadata (that's Gold's job)
  - NO glossary metadata (that's Gold's job)
  - NO filtering (preserve everything)

Silver Model Pattern:

-- models/silver/warbyparker/contact_timestamps.sql

WITH bronze_source AS (
    SELECT * FROM {{ source('bronze_warbyparker', 'GLADLY_CONTACT_TIMESTAMPS') }}
),

typed_and_cleaned AS (
    SELECT
        -- Type conversions
        CAST(CONTACT_ID AS VARCHAR) AS CONTACT_ID,
        CAST(CONVERSATION_ID AS VARCHAR) AS CONVERSATION_ID,
        TO_TIMESTAMP_NTZ(TIMESTAMP) AS TIMESTAMP,

        -- All other columns (type-converted)
        AGENT_NAME,
        ASSIGNED_INBOX_ID,
        ASSIGNED_INBOX_NAME,
        CHANNEL,
        -- ... all Bronze columns

        -- Preserve metadata
        RECORD_KEY,
        RECORD_HASH,
        DATE_TO_WAREHOUSE,
        SOURCE_FILE,
        BATCH_ID,
        PIPELINE_RUN_ID,

        -- Add dbt audit
        CURRENT_TIMESTAMP() AS DBT_LOADED_AT,
        '{{ invocation_id }}' AS DBT_INVOCATION_ID

    FROM bronze_source
)

{% if is_incremental() %}
-- Only append new records
, new_records AS (
    SELECT typed.*
    FROM typed_and_cleaned typed
    LEFT JOIN {{ this }} silver
        ON typed.RECORD_KEY = silver.RECORD_KEY
    WHERE silver.RECORD_KEY IS NULL
)
SELECT * FROM new_records
{% else %}
SELECT * FROM typed_and_cleaned
{% endif %}

Benefits: - ✅ Fast incremental runs (no expensive joins) - ✅ Immutable history (never changes) - ✅ Simple to maintain - ✅ Can reprocess Gold without touching Silver


LAYER 3A: GOLD - Core Entities (Enriched)

Purpose: Denormalized business entities with all context (SOT for analysts)

Who creates it: dbt

What it does: - ✅ JOIN with employee rosters - ✅ JOIN with client glossaries - ✅ Filter to XO agents only - ✅ Latest state or full history (depending on use case) - ✅ Wide denormalized tables (everything analysts need)

Why denormalize here? - Storage is cheap, compute is expensive - Rosters and glossaries are relatively static - Analysts query this layer - make it fast - Pre-joined is better than join-at-query-time

Naming convention: {OBJECT} or {OBJECT}_ENRICHED

Materialization: - table (if keeping latest only) - incremental (if keeping full history with enrichment)

Example Tables:

WBP_DB.GOLD.CONTACTS
  - All Silver columns
  - + XO employee metadata (AGENT_ID, EMAIL, ROLE, SITE, LEADER)
  - + Client glossary metadata (INBOX_CATEGORY, QUEUE_TYPE)
  - Filtered to XO agents only
  - Latest state or full history

WBP_DB.GOLD.CONVERSATIONS
  - All Silver columns
  - + XO employee metadata
  - + Client glossary metadata
  - Filtered to XO agents only

Gold Core Model Pattern (Latest State):

-- models/gold/warbyparker/contacts.sql

{{
    config(
        materialized='table',  -- Rebuilt daily with latest
        schema='gold'
    )
}}

WITH silver_history AS (
    SELECT * FROM {{ ref('contact_timestamps') }}
),

-- Latest record per contact (if needed)
latest_contacts AS (
    SELECT *
    FROM silver_history
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY CONTACT_ID
        ORDER BY TIMESTAMP DESC
    ) = 1
),

-- Employee roster (from CORE_DB)
employee_roster AS (
    SELECT
        GLADLY_NAME,
        EID AS AGENT_ID,
        FULL_NAME AS AGENT_FULL_NAME,
        XO_EMAIL AS AGENT_EMAIL,
        ACCOUNT_EMAIL AS AGENT_ACCOUNT_EMAIL,
        ROLE AS AGENT_ROLE,
        SITE AS AGENT_SITE,
        DIRECT_LEADER AS AGENT_LEADER,
        STATUS AS AGENT_STATUS
    FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }}
    WHERE STATUS = 'Active'  -- or include terminated based on needs
),

-- Client glossaries
inbox_glossary AS (
    SELECT
        INBOX_ID,
        INBOX_NAME,
        INBOX_CATEGORY,
        INBOX_TYPE
    FROM {{ source('core_glossaries', 'WARBYPARKER_INBOX_GLOSSARY') }}
),

queue_glossary AS (
    SELECT
        QUEUE_ID,
        QUEUE_NAME,
        QUEUE_CATEGORY,
        QUEUE_TYPE
    FROM {{ source('core_glossaries', 'WARBYPARKER_QUEUE_GLOSSARY') }}
),

-- Enriched contacts (SOT for analysts)
enriched AS (
    SELECT
        -- Original Silver columns
        contacts.*,

        -- Employee metadata (XO agents)
        roster.AGENT_ID,
        roster.AGENT_FULL_NAME,
        roster.AGENT_EMAIL,
        roster.AGENT_ROLE,
        roster.AGENT_SITE,
        roster.AGENT_LEADER,
        roster.AGENT_STATUS,

        -- Glossary metadata
        inbox.INBOX_CATEGORY,
        inbox.INBOX_TYPE,
        queue.QUEUE_CATEGORY,
        queue.QUEUE_TYPE,

        -- Derived flags
        CASE
            WHEN roster.AGENT_EMAIL LIKE 'xo.%@%' THEN TRUE
            ELSE FALSE
        END AS IS_XO_AGENT,

        -- dbt metadata
        CURRENT_TIMESTAMP() AS DBT_UPDATED_AT

    FROM latest_contacts contacts
    LEFT JOIN employee_roster roster
        ON contacts.ASSIGNED_AGENT_NAME = roster.GLADLY_NAME
    LEFT JOIN inbox_glossary inbox
        ON contacts.ASSIGNED_INBOX_ID = inbox.INBOX_ID
    LEFT JOIN queue_glossary queue
        ON contacts.QUEUE_ID = queue.QUEUE_ID  -- adjust based on actual schema
)

-- Filter to XO agents only
SELECT *
FROM enriched
WHERE IS_XO_AGENT = TRUE

Benefits: - ✅ Analysts have everything in one table - ✅ No joins needed at query time - ✅ Fast queries - ✅ Latest state is clear - ✅ Can rebuild daily without touching Silver


LAYER 3B: GOLD - Metrics (Aggregated KPIs)

Purpose: Pre-aggregated business metrics for fast dashboard queries

Who creates it: dbt

What it does: - ✅ Aggregate from Gold enriched tables - ✅ Common grains: daily, hourly, by agent, by channel - ✅ Standardized KPI schema across clients - ✅ Client-specific metrics as additional columns

Naming convention: {METRIC}_{GRAIN}

Materialization: table or incremental

Example Tables:

WBP_DB.GOLD.AGENT_KPIS_DAILY
  - Aggregated by: KPI_DATE, AGENT_ID
  - Standard KPIs: AVG_HANDLE_TIME, CONTACT_COUNT, AVG_CSAT, etc.
  - Client-specific KPIs: CHAT_COUNT, IVR_ENTRY_COUNT, etc.

WBP_DB.GOLD.CHANNEL_METRICS_HOURLY
  - Aggregated by: HOUR, CHANNEL
  - Metrics: VOLUME, AVG_AHT, etc.

WBP_DB.GOLD.QUEUE_PERFORMANCE_DAILY
  - Aggregated by: DATE, QUEUE_ID
  - Metrics: SLA%, AVG_WAIT_TIME, etc.

Gold Metrics Model Pattern:

-- models/gold/warbyparker/agent_kpis_daily.sql

{{
    config(
        materialized='table',  -- Rebuild daily
        schema='gold'
    )
}}

WITH enriched_contacts AS (
    SELECT * FROM {{ ref('contacts') }}
),

daily_agent_metrics AS (
    SELECT
        -- Dimensions
        DATE(TIMESTAMP) AS KPI_DATE,
        AGENT_ID,
        AGENT_EMAIL,
        AGENT_FULL_NAME,
        AGENT_ROLE,
        AGENT_SITE,

        -- ==========================================
        -- STANDARD KPIs (ALL CLIENTS MUST HAVE)
        -- ==========================================

        -- Volume metrics
        COUNT(DISTINCT CONTACT_ID) AS CONTACT_COUNT,
        COUNT(DISTINCT CONVERSATION_ID) AS CONVERSATION_COUNT,
        COUNT(DISTINCT CUSTOMER_ID) AS UNIQUE_CUSTOMER_COUNT,

        -- Handle time metrics
        AVG(HANDLE_TIME_SECONDS) AS AVG_HANDLE_TIME,
        SUM(HANDLE_TIME_SECONDS) AS TOTAL_HANDLE_TIME,
        MIN(HANDLE_TIME_SECONDS) AS MIN_HANDLE_TIME,
        MAX(HANDLE_TIME_SECONDS) AS MAX_HANDLE_TIME,

        -- Hold metrics
        AVG(HOLD_TIME_SECONDS) AS AVG_HOLD_TIME,
        SUM(HOLD_TIME_SECONDS) AS TOTAL_HOLD_TIME,
        COUNT(CASE WHEN HOLD_TIME_SECONDS > 0 THEN 1 END) AS CONTACTS_WITH_HOLD,

        -- AHT
        AVG(AHT_SECONDS) AS AVG_AHT,

        -- Quality scores (if available)
        AVG(CSAT_SCORE) AS AVG_CSAT,
        AVG(QA_SCORE) AS AVG_QA_SCORE,

        -- Response time
        AVG(FIRST_RESPONSE_TIME_SECONDS) AS AVG_FIRST_RESPONSE_TIME,

        -- ==========================================
        -- CLIENT-SPECIFIC KPIs (Warby Parker)
        -- ==========================================

        -- Channel breakdown
        COUNT(CASE WHEN CHANNEL = 'CHAT' THEN 1 END) AS CHAT_COUNT,
        COUNT(CASE WHEN CHANNEL = 'EMAIL' THEN 1 END) AS EMAIL_COUNT,
        COUNT(CASE WHEN CHANNEL = 'VOICE' THEN 1 END) AS VOICE_COUNT,
        COUNT(CASE WHEN CHANNEL = 'SMS' THEN 1 END) AS SMS_COUNT,

        -- Entry points
        COUNT(CASE WHEN ENTRY_POINT = 'IVR' THEN 1 END) AS IVR_ENTRY_COUNT,
        COUNT(CASE WHEN ENTRY_POINT = 'WEB' THEN 1 END) AS WEB_ENTRY_COUNT,

        -- Payment metrics
        AVG(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS AVG_PAYMENT_AMOUNT,
        SUM(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS TOTAL_PAYMENT_AMOUNT,
        COUNT(CASE WHEN PAYMENT_AMOUNT > 0 THEN 1 END) AS TRANSACTIONS_COUNT,

        -- Inbox categories (from glossary enrichment)
        COUNT(CASE WHEN INBOX_CATEGORY = 'SALES' THEN 1 END) AS SALES_CONTACTS,
        COUNT(CASE WHEN INBOX_CATEGORY = 'SUPPORT' THEN 1 END) AS SUPPORT_CONTACTS,
        COUNT(CASE WHEN INBOX_CATEGORY = 'RETURNS' THEN 1 END) AS RETURNS_CONTACTS,

        -- Metadata
        CURRENT_TIMESTAMP() AS DBT_UPDATED_AT

    FROM enriched_contacts
    GROUP BY 1, 2, 3, 4, 5, 6
)

SELECT * FROM daily_agent_metrics

Standard KPI Schema (for ALL clients):

-- Define this schema once, enforce across all clients

STANDARD_COLUMNS (required for every client):
  - KPI_DATE
  - AGENT_ID, AGENT_EMAIL, AGENT_FULL_NAME
  - AGENT_ROLE, AGENT_SITE
  - CONTACT_COUNT
  - CONVERSATION_COUNT
  - UNIQUE_CUSTOMER_COUNT
  - AVG_HANDLE_TIME, TOTAL_HANDLE_TIME
  - AVG_HOLD_TIME, TOTAL_HOLD_TIME
  - AVG_AHT
  - AVG_CSAT, AVG_QA_SCORE
  - AVG_FIRST_RESPONSE_TIME

CLIENT_SPECIFIC_COLUMNS (vary by client):
  - Warby Parker: CHAT_COUNT, IVR_ENTRY_COUNT, AVG_PAYMENT_AMOUNT
  - Condé Nast: TOPIC_COUNT, ARTICLE_LINKS_SENT
  - Keller Postman: CASE_TYPE_COUNT, LEGAL_CONSULTATIONS


LAYER 3C: GOLD - Views (Consumption)

Purpose: Specific views for different consumption patterns

Who creates it: dbt

What it does: - ✅ Tableau-specific views (denormalized, optimized for dashboards) - ✅ Ad-hoc analyst views (with helpful calculations pre-computed) - ✅ Executive summary views (high-level aggregations)

Naming convention: {PURPOSE}

Materialization: view (not table - no storage cost)

Example Views:

WBP_DB.GOLD.AGENT_DASHBOARD          -- Tableau dashboard view
WBP_DB.GOLD.AGENT_AD_HOC             -- Analyst exploration view
WBP_DB.GOLD.EXECUTIVE_SUMMARY        -- Executive rollup view
WBP_DB.GOLD.AGENT_PERFORMANCE        -- Manager performance view

Gold View Pattern (Tableau Dashboard):

-- models/gold/warbyparker/agent_dashboard.sql

{{
    config(
        materialized='view',  -- Not a table, just a view
        schema='gold'
    )
}}

WITH agent_kpis AS (
    SELECT * FROM {{ ref('agent_kpis_daily') }}
),

-- Current roster (for latest employee info)
current_roster AS (
    SELECT
        EID AS AGENT_ID,
        FULL_NAME,
        ROLE,
        SITE,
        DIRECT_LEADER,
        STATUS,
        HIRE_DATE,
        START_DATE_IN_ACCOUNT
    FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }}
),

-- Performance targets (if available)
targets AS (
    SELECT
        TARGET_DATE,
        AHT_TARGET,
        CSAT_TARGET,
        QA_TARGET,
        HANDLE_TIME_TARGET
    FROM {{ source('core_targets', 'WARBYPARKER_TARGETS') }}
),

dashboard AS (
    SELECT
        -- KPI data
        kpis.KPI_DATE,
        kpis.AGENT_ID,
        kpis.AGENT_EMAIL,
        kpis.AGENT_FULL_NAME,

        -- Current roster info (may differ from historical)
        roster.ROLE AS CURRENT_ROLE,
        roster.SITE AS CURRENT_SITE,
        roster.DIRECT_LEADER,
        roster.STATUS AS CURRENT_STATUS,
        roster.HIRE_DATE,
        DATEDIFF(day, roster.START_DATE_IN_ACCOUNT, kpis.KPI_DATE) AS DAYS_IN_ACCOUNT,

        -- Volume metrics
        kpis.CONTACT_COUNT,
        kpis.CONVERSATION_COUNT,
        kpis.UNIQUE_CUSTOMER_COUNT,

        -- Performance metrics
        kpis.AVG_HANDLE_TIME,
        kpis.AVG_AHT,
        kpis.AVG_CSAT,
        kpis.AVG_QA_SCORE,

        -- Targets
        targets.AHT_TARGET,
        targets.CSAT_TARGET,
        targets.QA_TARGET,

        -- Variance calculations
        kpis.AVG_AHT - targets.AHT_TARGET AS AHT_VARIANCE,
        kpis.AVG_CSAT - targets.CSAT_TARGET AS CSAT_VARIANCE,
        kpis.AVG_QA_SCORE - targets.QA_TARGET AS QA_VARIANCE,

        -- Performance flags (for conditional formatting in Tableau)
        CASE
            WHEN kpis.AVG_AHT <= targets.AHT_TARGET THEN 'Meeting Target'
            WHEN kpis.AVG_AHT <= targets.AHT_TARGET * 1.1 THEN 'Close to Target'
            ELSE 'Below Target'
        END AS AHT_PERFORMANCE,

        CASE
            WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET THEN 'Meeting Target'
            WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET * 0.9 THEN 'Close to Target'
            ELSE 'Below Target'
        END AS CSAT_PERFORMANCE,

        -- Channel breakdown
        kpis.CHAT_COUNT,
        kpis.EMAIL_COUNT,
        kpis.VOICE_COUNT,

        -- Client-specific
        kpis.IVR_ENTRY_COUNT,
        kpis.AVG_PAYMENT_AMOUNT,
        kpis.TOTAL_PAYMENT_AMOUNT

    FROM agent_kpis kpis
    LEFT JOIN current_roster roster
        ON kpis.AGENT_ID = roster.AGENT_ID
    LEFT JOIN targets
        ON kpis.KPI_DATE = targets.TARGET_DATE
)

SELECT * FROM dashboard

Benefits: - ✅ No storage cost (views are just queries) - ✅ Always reflects latest roster/target data - ✅ Tableau-optimized (denormalized, pre-calculated) - ✅ Easy to modify without rebuilding data


Complete Data Flow

┌─────────────────────────────────────────────────────────────────┐
│ Step 1: Copy to Snowflake                                       │
│ xo-foundry → BRONZE.GLADLY_CONTACT_TIMESTAMPS                   │
│   - Raw data (all VARCHAR)                                      │
│   - Truncated daily (idempotent)                                │
│   - Metadata added (RECORD_KEY, RECORD_HASH, etc.)              │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Step 2: Merge into historical table with record metadata        │
│ dbt → SILVER.CONTACT_TIMESTAMPS                                 │
│   - Append-only (INSERT WHERE NOT EXISTS on RECORD_KEY)         │
│   - Type conversion (VARCHAR → proper types)                    │
│   - Preserve metadata from Bronze                               │
│   - NO enrichment, NO filtering (pure preservation)             │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Steps 3 & 4: Add XO employee + client metadata                  │
│ dbt → GOLD.CONTACTS (enriched core entity)                      │
│   - JOIN with CORE.ROSTER (employee metadata)                   │
│   - JOIN with CORE.INBOX_GLOSSARY (client metadata)             │
│   - JOIN with CORE.QUEUE_GLOSSARY (client metadata)             │
│   - Filter to XO agents only (WHERE IS_XO_AGENT = TRUE)         │
│   - Latest state or full history                                │
│   ← THIS IS THE SOURCE OF TRUTH FOR ANALYSTS                    │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Step 5: Create SOT KPI table                                    │
│ dbt → GOLD.AGENT_KPIS_DAILY                                     │
│   - Aggregate from GOLD.CONTACTS                                │
│   - Standard KPI schema (all clients share)                     │
│   - Client-specific KPIs (additional columns)                   │
│   - Pre-aggregated for performance                              │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Step 6: Create views for consumption                            │
│ dbt → GOLD.AGENT_DASHBOARD (view)                               │
│      GOLD.AGENT_AD_HOC (view)                                   │
│   - Tableau-specific views                                      │
│   - Ad-hoc analyst views                                        │
│   - No storage cost (just queries)                              │
└─────────────────────────────────────────────────────────────────┘

Performance Considerations

Why Denormalize in Gold?

Storage is cheap, compute is expensive in Snowflake:

Option A: Join at query time (BAD)

-- Every Tableau query runs this JOIN
SELECT
    contacts.*,
    roster.*,
    glossary.*
FROM SILVER.CONTACTS contacts
JOIN CORE.ROSTER roster ON ...
JOIN CORE.GLOSSARY glossary ON ...
WHERE ...
- ❌ Expensive joins on every query - ❌ Slow dashboard load times - ❌ High compute cost

Option B: Denormalize once in Gold (GOOD)

-- Gold table already has everything joined
SELECT * FROM GOLD.CONTACTS
WHERE ...
- ✅ Fast queries (no joins) - ✅ Low compute cost - ✅ Better user experience

Trade-off: More storage, but storage is ~10x cheaper than compute in Snowflake.


Incremental vs Full Refresh

Silver (historical preservation): - Use incremental with append strategy - Only process new records each run - Fast and efficient

Gold enriched (core entities): - Option A: table materialized, rebuild daily - Simpler logic - Always fresh with latest rosters/glossaries - Good for tables under 100M rows

  • Option B: incremental with merge logic
  • More complex
  • Faster for very large tables
  • Need to handle roster/glossary changes

Recommendation: Start with Option A (table), move to Option B if performance becomes an issue.

Gold metrics (aggregated): - Use table, rebuild daily from Gold enriched - Pre-aggregated = fast for dashboards - Can switch to incremental for very large time ranges


Naming Convention (Final Recommendation)

Drop ALL Prefixes

Principle: Schema name provides context, don't repeat it

-- ❌ OLD (redundant)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
SILVER.WH_CONVERSATIONS
GOLD.DM_SCORECARD_GPP

-- ✅ NEW (clean)
SILVER.CONTACT_TIMESTAMPS
SILVER.CONVERSATIONS
GOLD.AGENT_KPIS_DAILY

Full Naming Convention:

Bronze: {SOURCE}_{OBJECT}

BRONZE.GLADLY_CONTACT_TIMESTAMPS
BRONZE.GLADLY_CONVERSATION_TIMESTAMPS
BRONZE.SPROUT_MESSAGES

Silver: {OBJECT} (optionally {OBJECT}_HISTORY)

SILVER.CONTACT_TIMESTAMPS
SILVER.CONVERSATION_TIMESTAMPS
SILVER.MESSAGES

Gold Core: {OBJECT} (singular for dimension-like, plural for fact-like)

GOLD.CONTACTS              -- Enriched contact facts
GOLD.CONVERSATIONS         -- Enriched conversation facts
GOLD.AGENTS                -- Current agent dimension

Gold Metrics: {METRIC}_{GRAIN}

GOLD.AGENT_KPIS_DAILY
GOLD.CHANNEL_METRICS_HOURLY
GOLD.QUEUE_PERFORMANCE_DAILY
GOLD.EXECUTIVE_SUMMARY_WEEKLY

Gold Views: {PURPOSE}

GOLD.AGENT_DASHBOARD
GOLD.AGENT_AD_HOC
GOLD.EXECUTIVE_SUMMARY
GOLD.MANAGER_SCORECARD


Supporting Infrastructure

Employee Rosters

Location: CORE_DB (shared across clients)

Recommendation: Keep separate per client initially

CORE_DB.BRONZE.BAMBOOHR_WARBYPARKER_RAW
CORE_DB.SILVER.ROSTER_WARBYPARKER           -- Current roster
CORE_DB.SILVER.ROSTER_WARBYPARKER_HISTORY   -- SCD Type 2

CORE_DB.BRONZE.BAMBOOHR_CONDENAST_RAW
CORE_DB.SILVER.ROSTER_CONDENAST
CORE_DB.SILVER.ROSTER_CONDENAST_HISTORY

Future consolidation (when all clients standardized):

CORE_DB.SILVER.ROSTER_ALL
  - Add CLIENT_NAME column
  - Unified schema

Glossary Tables

Location: CORE_DB (shared reference data)

Naming: {CLIENT}_{TOPIC}_GLOSSARY

CORE_DB.SILVER.WARBYPARKER_INBOX_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_QUEUE_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_SCORECARD_GLOSSARY

CORE_DB.SILVER.CONDENAST_DISPOSITION_GLOSSARY
CORE_DB.SILVER.CONDENAST_TOPIC_GLOSSARY

Why CORE_DB? These are shared reference data, not client-specific facts.


Migration Path

Phase 1: Fix Silver (Remove Enrichment)

Current: Silver has employee + glossary joins Target: Silver is pure historical preservation

-- BEFORE (mixed concerns)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
  - Has employee JOINs
  - Has glossary JOINs
  - Has filtering

-- AFTER (single concern)
SILVER.CONTACT_TIMESTAMPS
  - No JOINs
  - No filtering
  - Just type conversion + deduplication

Phase 2: Create Gold Core (Enriched Entities)

New tables:

GOLD.CONTACTS              -- Enriched with all metadata
GOLD.CONVERSATIONS         -- Enriched with all metadata

These become the SOT for analysts.

Phase 3: Create Gold Metrics (KPIs)

New tables:

GOLD.AGENT_KPIS_DAILY
GOLD.CHANNEL_METRICS_HOURLY
GOLD.QUEUE_PERFORMANCE_DAILY

Define standard schema across all clients.

Phase 4: Create Gold Views (Consumption)

New views:

GOLD.AGENT_DASHBOARD       -- Tableau
GOLD.AGENT_AD_HOC          -- Analysts
GOLD.EXECUTIVE_SUMMARY     -- Executives

Phase 5: Deprecate Old Tables

  • Remove old OPERATIONS.AGGREGATE.* tables
  • Update Tableau connections to point to new Gold views

Summary of Recommendations

✅ Layer Responsibilities

Layer Purpose Enrichment? Aggregation? Filtering?
Bronze Raw landing ❌ No ❌ No ❌ No
Silver Historical preservation ❌ No ❌ No ❌ No
Gold Core Enriched entities ✅ YES ❌ No ✅ YES
Gold Metrics Aggregated KPIs N/A ✅ YES ✅ YES
Gold Views Consumption N/A Optional Optional

✅ Naming Convention

  • Drop ALL prefixes (SILVER_, WH_, DM_)
  • Schema name provides layer context
  • Use descriptive table names

✅ Performance Strategy

  • Denormalize in Gold (storage is cheap)
  • Pre-aggregate common patterns (metrics layer)
  • Use views for flexible consumption (no storage cost)

✅ Separation of Concerns

  • Silver = Immutable history (never changes after written)
  • Gold = Business logic (rebuild as rosters/glossaries change)
  • Clear boundary between layers

Next Actions

Questions to answer:

  1. Should we rename existing Silver tables to drop SILVER_ prefix?
  2. When should we implement Gold enriched layer (GOLD.CONTACTS)?
  3. What standard KPIs do ALL clients need? (define schema)
  4. Should rosters stay separate or consolidate to ROSTER_ALL?
  5. Where are glossaries sourced from (Google Sheets? Manual entry)?

Implementation priority:

  1. ✅ Bronze → Already correct
  2. ⚠️ Silver → Remove enrichment, keep simple
  3. 🎯 Gold Core → Add enrichment here (employee + glossary)
  4. 🎯 Gold Metrics → Create standard KPI schema
  5. 🎯 Gold Views → Tableau + analyst views

This architecture follows Snowflake + dbt best practices and optimizes for: - ✅ Performance (denormalize early, pre-aggregate) - ✅ Maintainability (clear separation of concerns) - ✅ Cost efficiency (storage cheap, compute expensive) - ✅ Analyst experience (wide tables, no joins needed)