ELT Layer Architecture - Best Practices & Recommendations¶

Date: 2026-01-05 Status: Recommendation Purpose: Define optimal ELT layer architecture based on Snowflake + dbt best practices

Analysis: Required Steps & Optimal Placement¶

Your Required Pipeline Steps:¶

✅ Copy to Snowflake
✅ Merge into historical table with record metadata
✅ Add XO employee metadata
✅ Add client-specific metadata (glossaries)
✅ Create SOT (Source of Truth) KPI table
✅ Create multiple views for consumption (analysts + Tableau)

❌ Problems with Current Approach¶

Issue 1: Redundant Table Prefixes¶

Current:

SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS  ❌ Redundant prefix
GOLD.DM_SCORECARD_GPP                         ❌ Unnecessary DM_ prefix

Better:

SILVER.CONTACT_TIMESTAMPS                     ✅ Schema name provides context
GOLD.AGENT_KPIS_DAILY                         ✅ Descriptive, no prefix needed

Principle: Schema name already indicates layer - don't repeat it in table name.

Issue 2: Mixing Concerns in Silver¶

Current plan (from earlier discussion): - Silver does: Historical preservation + Employee enrichment + Glossary enrichment + Filtering

Problem: This mixes multiple concerns in one layer - Historical preservation (append records) - Business enrichment (JOINs with rosters/glossaries) - Filtering logic (XO agents only)

Result: - Complex Silver models - Expensive joins on every incremental run - Hard to maintain

Issue 3: Unclear "Source of Truth"¶

What is the SOT for analysts? - Silver historical table? (has all records but no enrichment) - Gold KPI table? (aggregated, lost granularity) - Something in between?

Best practice: SOT should be the most granular enriched table - all records with all metadata.

✅ Recommended Architecture (Best Practices)¶

Layer Philosophy¶

Bronze: Immutable raw data (source of record) Silver: Historical preservation (source of history) Gold: Enriched + aggregated data (source of truth for business)

Key Principle: Each layer has a single clear purpose.

Optimal Layer Design¶

LAYER 1: BRONZE (Raw Landing)¶

Purpose: Immutable raw source data exactly as received

Who creates it: xo-foundry (Airflow)

Characteristics: - ✅ Truncated daily (idempotent loads) - ✅ All VARCHAR (no type conversion) - ✅ Metadata added: RECORD_KEY, RECORD_HASH, DATE_TO_WAREHOUSE - ✅ NO transformations - ✅ NO filtering - ✅ Temporary landing zone (doesn't preserve history)

Naming convention: {SOURCE}_{OBJECT}

Example:

WBP_DB.BRONZE.GLADLY_CONTACT_TIMESTAMPS
WBP_DB.BRONZE.GLADLY_CONVERSATION_TIMESTAMPS
WBP_DB.BRONZE.SPROUT_MESSAGES

✅ This is already correct - no changes needed.

LAYER 2: SILVER (Historical Preservation)¶

Purpose: Immutable historical record of all source data

Who creates it: dbt

What it does: - ✅ Preserve full history (append-only or SCD Type 2) - ✅ Type conversion (VARCHAR → proper types) - ✅ Column standardization (uppercase, underscores) - ✅ Deduplication (RECORD_KEY) - ⚠️ NO enrichment (no JOINs with other tables) - ⚠️ NO filtering (preserve all records) - ⚠️ NO aggregation (keep grain identical to Bronze)

Why no enrichment? - Performance: Joins are expensive on large fact tables - Maintainability: If roster/glossary structure changes, don't reprocess history - Separation of concerns: Silver = preservation, Gold = enrichment

Naming convention: {OBJECT}_HISTORY or just {OBJECT}

Materialization: incremental (append strategy)

Example:

-- Simple, fast, minimal transformation
WBP_DB.SILVER.CONTACT_TIMESTAMPS
  - All Bronze columns (properly typed)
  - Preserved metadata columns
  - DBT audit columns
  - NO employee metadata (that's Gold's job)
  - NO glossary metadata (that's Gold's job)
  - NO filtering (preserve everything)

Silver Model Pattern:

-- models/silver/warbyparker/contact_timestamps.sql

WITH bronze_source AS (
    SELECT * FROM {{ source('bronze_warbyparker', 'GLADLY_CONTACT_TIMESTAMPS') }}
),

typed_and_cleaned AS (
    SELECT
        -- Type conversions
        CAST(CONTACT_ID AS VARCHAR) AS CONTACT_ID,
        CAST(CONVERSATION_ID AS VARCHAR) AS CONVERSATION_ID,
        TO_TIMESTAMP_NTZ(TIMESTAMP) AS TIMESTAMP,

        -- All other columns (type-converted)
        AGENT_NAME,
        ASSIGNED_INBOX_ID,
        ASSIGNED_INBOX_NAME,
        CHANNEL,
        -- ... all Bronze columns

        -- Preserve metadata
        RECORD_KEY,
        RECORD_HASH,
        DATE_TO_WAREHOUSE,
        SOURCE_FILE,
        BATCH_ID,
        PIPELINE_RUN_ID,

        -- Add dbt audit
        CURRENT_TIMESTAMP() AS DBT_LOADED_AT,
        '{{ invocation_id }}' AS DBT_INVOCATION_ID

    FROM bronze_source
)

{% if is_incremental() %}
-- Only append new records
, new_records AS (
    SELECT typed.*
    FROM typed_and_cleaned typed
    LEFT JOIN {{ this }} silver
        ON typed.RECORD_KEY = silver.RECORD_KEY
    WHERE silver.RECORD_KEY IS NULL
)
SELECT * FROM new_records
{% else %}
SELECT * FROM typed_and_cleaned
{% endif %}

Benefits: - ✅ Fast incremental runs (no expensive joins) - ✅ Immutable history (never changes) - ✅ Simple to maintain - ✅ Can reprocess Gold without touching Silver

LAYER 3A: GOLD - Core Entities (Enriched)¶

Purpose: Denormalized business entities with all context (SOT for analysts)

Who creates it: dbt

What it does: - ✅ JOIN with employee rosters - ✅ JOIN with client glossaries - ✅ Filter to XO agents only - ✅ Latest state or full history (depending on use case) - ✅ Wide denormalized tables (everything analysts need)

Why denormalize here? - Storage is cheap, compute is expensive - Rosters and glossaries are relatively static - Analysts query this layer - make it fast - Pre-joined is better than join-at-query-time

Naming convention: {OBJECT} or {OBJECT}_ENRICHED

Materialization: - table (if keeping latest only) - incremental (if keeping full history with enrichment)

Example Tables:

WBP_DB.GOLD.CONTACTS
  - All Silver columns
  - + XO employee metadata (AGENT_ID, EMAIL, ROLE, SITE, LEADER)
  - + Client glossary metadata (INBOX_CATEGORY, QUEUE_TYPE)
  - Filtered to XO agents only
  - Latest state or full history

WBP_DB.GOLD.CONVERSATIONS
  - All Silver columns
  - + XO employee metadata
  - + Client glossary metadata
  - Filtered to XO agents only

Gold Core Model Pattern (Latest State):

-- models/gold/warbyparker/contacts.sql

{{
    config(
        materialized='table',  -- Rebuilt daily with latest
        schema='gold'
    )
}}

WITH silver_history AS (
    SELECT * FROM {{ ref('contact_timestamps') }}
),

-- Latest record per contact (if needed)
latest_contacts AS (
    SELECT *
    FROM silver_history
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY CONTACT_ID
        ORDER BY TIMESTAMP DESC
    ) = 1
),

-- Employee roster (from CORE_DB)
employee_roster AS (
    SELECT
        GLADLY_NAME,
        EID AS AGENT_ID,
        FULL_NAME AS AGENT_FULL_NAME,
        XO_EMAIL AS AGENT_EMAIL,
        ACCOUNT_EMAIL AS AGENT_ACCOUNT_EMAIL,
        ROLE AS AGENT_ROLE,
        SITE AS AGENT_SITE,
        DIRECT_LEADER AS AGENT_LEADER,
        STATUS AS AGENT_STATUS
    FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }}
    WHERE STATUS = 'Active'  -- or include terminated based on needs
),

-- Client glossaries
inbox_glossary AS (
    SELECT
        INBOX_ID,
        INBOX_NAME,
        INBOX_CATEGORY,
        INBOX_TYPE
    FROM {{ source('core_glossaries', 'WARBYPARKER_INBOX_GLOSSARY') }}
),

queue_glossary AS (
    SELECT
        QUEUE_ID,
        QUEUE_NAME,
        QUEUE_CATEGORY,
        QUEUE_TYPE
    FROM {{ source('core_glossaries', 'WARBYPARKER_QUEUE_GLOSSARY') }}
),

-- Enriched contacts (SOT for analysts)
enriched AS (
    SELECT
        -- Original Silver columns
        contacts.*,

        -- Employee metadata (XO agents)
        roster.AGENT_ID,
        roster.AGENT_FULL_NAME,
        roster.AGENT_EMAIL,
        roster.AGENT_ROLE,
        roster.AGENT_SITE,
        roster.AGENT_LEADER,
        roster.AGENT_STATUS,

        -- Glossary metadata
        inbox.INBOX_CATEGORY,
        inbox.INBOX_TYPE,
        queue.QUEUE_CATEGORY,
        queue.QUEUE_TYPE,

        -- Derived flags
        CASE
            WHEN roster.AGENT_EMAIL LIKE 'xo.%@%' THEN TRUE
            ELSE FALSE
        END AS IS_XO_AGENT,

        -- dbt metadata
        CURRENT_TIMESTAMP() AS DBT_UPDATED_AT

    FROM latest_contacts contacts
    LEFT JOIN employee_roster roster
        ON contacts.ASSIGNED_AGENT_NAME = roster.GLADLY_NAME
    LEFT JOIN inbox_glossary inbox
        ON contacts.ASSIGNED_INBOX_ID = inbox.INBOX_ID
    LEFT JOIN queue_glossary queue
        ON contacts.QUEUE_ID = queue.QUEUE_ID  -- adjust based on actual schema
)

-- Filter to XO agents only
SELECT *
FROM enriched
WHERE IS_XO_AGENT = TRUE

Benefits: - ✅ Analysts have everything in one table - ✅ No joins needed at query time - ✅ Fast queries - ✅ Latest state is clear - ✅ Can rebuild daily without touching Silver

LAYER 3B: GOLD - Metrics (Aggregated KPIs)¶

Purpose: Pre-aggregated business metrics for fast dashboard queries

Who creates it: dbt

What it does: - ✅ Aggregate from Gold enriched tables - ✅ Common grains: daily, hourly, by agent, by channel - ✅ Standardized KPI schema across clients - ✅ Client-specific metrics as additional columns

Naming convention: {METRIC}_{GRAIN}

Materialization: table or incremental

Example Tables:

WBP_DB.GOLD.AGENT_KPIS_DAILY
  - Aggregated by: KPI_DATE, AGENT_ID
  - Standard KPIs: AVG_HANDLE_TIME, CONTACT_COUNT, AVG_CSAT, etc.
  - Client-specific KPIs: CHAT_COUNT, IVR_ENTRY_COUNT, etc.

WBP_DB.GOLD.CHANNEL_METRICS_HOURLY
  - Aggregated by: HOUR, CHANNEL
  - Metrics: VOLUME, AVG_AHT, etc.

WBP_DB.GOLD.QUEUE_PERFORMANCE_DAILY
  - Aggregated by: DATE, QUEUE_ID
  - Metrics: SLA%, AVG_WAIT_TIME, etc.

Gold Metrics Model Pattern:

-- models/gold/warbyparker/agent_kpis_daily.sql

{{
    config(
        materialized='table',  -- Rebuild daily
        schema='gold'
    )
}}

WITH enriched_contacts AS (
    SELECT * FROM {{ ref('contacts') }}
),

daily_agent_metrics AS (
    SELECT
        -- Dimensions
        DATE(TIMESTAMP) AS KPI_DATE,
        AGENT_ID,
        AGENT_EMAIL,
        AGENT_FULL_NAME,
        AGENT_ROLE,
        AGENT_SITE,

        -- ==========================================
        -- STANDARD KPIs (ALL CLIENTS MUST HAVE)
        -- ==========================================

        -- Volume metrics
        COUNT(DISTINCT CONTACT_ID) AS CONTACT_COUNT,
        COUNT(DISTINCT CONVERSATION_ID) AS CONVERSATION_COUNT,
        COUNT(DISTINCT CUSTOMER_ID) AS UNIQUE_CUSTOMER_COUNT,

        -- Handle time metrics
        AVG(HANDLE_TIME_SECONDS) AS AVG_HANDLE_TIME,
        SUM(HANDLE_TIME_SECONDS) AS TOTAL_HANDLE_TIME,
        MIN(HANDLE_TIME_SECONDS) AS MIN_HANDLE_TIME,
        MAX(HANDLE_TIME_SECONDS) AS MAX_HANDLE_TIME,

        -- Hold metrics
        AVG(HOLD_TIME_SECONDS) AS AVG_HOLD_TIME,
        SUM(HOLD_TIME_SECONDS) AS TOTAL_HOLD_TIME,
        COUNT(CASE WHEN HOLD_TIME_SECONDS > 0 THEN 1 END) AS CONTACTS_WITH_HOLD,

        -- AHT
        AVG(AHT_SECONDS) AS AVG_AHT,

        -- Quality scores (if available)
        AVG(CSAT_SCORE) AS AVG_CSAT,
        AVG(QA_SCORE) AS AVG_QA_SCORE,

        -- Response time
        AVG(FIRST_RESPONSE_TIME_SECONDS) AS AVG_FIRST_RESPONSE_TIME,

        -- ==========================================
        -- CLIENT-SPECIFIC KPIs (Warby Parker)
        -- ==========================================

        -- Channel breakdown
        COUNT(CASE WHEN CHANNEL = 'CHAT' THEN 1 END) AS CHAT_COUNT,
        COUNT(CASE WHEN CHANNEL = 'EMAIL' THEN 1 END) AS EMAIL_COUNT,
        COUNT(CASE WHEN CHANNEL = 'VOICE' THEN 1 END) AS VOICE_COUNT,
        COUNT(CASE WHEN CHANNEL = 'SMS' THEN 1 END) AS SMS_COUNT,

        -- Entry points
        COUNT(CASE WHEN ENTRY_POINT = 'IVR' THEN 1 END) AS IVR_ENTRY_COUNT,
        COUNT(CASE WHEN ENTRY_POINT = 'WEB' THEN 1 END) AS WEB_ENTRY_COUNT,

        -- Payment metrics
        AVG(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS AVG_PAYMENT_AMOUNT,
        SUM(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS TOTAL_PAYMENT_AMOUNT,
        COUNT(CASE WHEN PAYMENT_AMOUNT > 0 THEN 1 END) AS TRANSACTIONS_COUNT,

        -- Inbox categories (from glossary enrichment)
        COUNT(CASE WHEN INBOX_CATEGORY = 'SALES' THEN 1 END) AS SALES_CONTACTS,
        COUNT(CASE WHEN INBOX_CATEGORY = 'SUPPORT' THEN 1 END) AS SUPPORT_CONTACTS,
        COUNT(CASE WHEN INBOX_CATEGORY = 'RETURNS' THEN 1 END) AS RETURNS_CONTACTS,

        -- Metadata
        CURRENT_TIMESTAMP() AS DBT_UPDATED_AT

    FROM enriched_contacts
    GROUP BY 1, 2, 3, 4, 5, 6
)

SELECT * FROM daily_agent_metrics

Standard KPI Schema (for ALL clients):

-- Define this schema once, enforce across all clients

STANDARD_COLUMNS (required for every client):
  - KPI_DATE
  - AGENT_ID, AGENT_EMAIL, AGENT_FULL_NAME
  - AGENT_ROLE, AGENT_SITE
  - CONTACT_COUNT
  - CONVERSATION_COUNT
  - UNIQUE_CUSTOMER_COUNT
  - AVG_HANDLE_TIME, TOTAL_HANDLE_TIME
  - AVG_HOLD_TIME, TOTAL_HOLD_TIME
  - AVG_AHT
  - AVG_CSAT, AVG_QA_SCORE
  - AVG_FIRST_RESPONSE_TIME

CLIENT_SPECIFIC_COLUMNS (vary by client):
  - Warby Parker: CHAT_COUNT, IVR_ENTRY_COUNT, AVG_PAYMENT_AMOUNT
  - Condé Nast: TOPIC_COUNT, ARTICLE_LINKS_SENT
  - Keller Postman: CASE_TYPE_COUNT, LEGAL_CONSULTATIONS

LAYER 3C: GOLD - Views (Consumption)¶

Purpose: Specific views for different consumption patterns

Who creates it: dbt

What it does: - ✅ Tableau-specific views (denormalized, optimized for dashboards) - ✅ Ad-hoc analyst views (with helpful calculations pre-computed) - ✅ Executive summary views (high-level aggregations)

Naming convention: {PURPOSE}

Materialization: view (not table - no storage cost)

Example Views:

WBP_DB.GOLD.AGENT_DASHBOARD          -- Tableau dashboard view
WBP_DB.GOLD.AGENT_AD_HOC             -- Analyst exploration view
WBP_DB.GOLD.EXECUTIVE_SUMMARY        -- Executive rollup view
WBP_DB.GOLD.AGENT_PERFORMANCE        -- Manager performance view

Gold View Pattern (Tableau Dashboard):

 {{     }}      --              --

href="#__codelineno-11-1">-- models/gold/warbyparker/agent_dashboard.sql config( materialized='view', -- Not a table, just a view schema='gold' ) WITH agent_kpis AS ( SELECT * FROM {{ ref('agent_kpis_daily') }} ), Current roster (for latest employee info) current_roster AS ( SELECT EID AS AGENT_ID, FULL_NAME, ROLE, SITE, DIRECT_LEADER, STATUS, HIRE_DATE, START_DATE_IN_ACCOUNT FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }} ), Performance targets (if available) targets AS ( SELECT TARGET_DATE, AHT_TARGET, CSAT_TARGET, QA_TARGET, HANDLE_TIME_TARGET FROM {{ source('core_targets', 'WARBYPARKER_TARGETS') }} ), dashboard AS ( SELECT -- KPI data kpis.KPI_DATE, kpis.AGENT_ID, kpis.AGENT_EMAIL, kpis.AGENT_FULL_NAME, -- Current roster info (may differ from historical) roster.ROLE AS CURRENT_ROLE, roster.SITE AS CURRENT_SITE, roster.DIRECT_LEADER, roster.STATUS AS CURRENT_STATUS, roster.HIRE_DATE, DATEDIFF(day, roster.START_DATE_IN_ACCOUNT, kpis.KPI_DATE) AS DAYS_IN_ACCOUNT, -- Volume metrics kpis.CONTACT_COUNT, kpis.CONVERSATION_COUNT, kpis.UNIQUE_CUSTOMER_COUNT, -- Performance metrics kpis.AVG_HANDLE_TIME, kpis.AVG_AHT, kpis.AVG_CSAT, kpis.AVG_QA_SCORE, -- Targets targets.AHT_TARGET, targets.CSAT_TARGET, targets.QA_TARGET, -- Variance calculations kpis.AVG_AHT - targets.AHT_TARGET AS AHT_VARIANCE, kpis.AVG_CSAT - targets.CSAT_TARGET AS CSAT_VARIANCE, kpis.AVG_QA_SCORE - targets.QA_TARGET AS QA_VARIANCE, -- Performance flags (for conditional formatting in Tableau) CASE WHEN kpis.AVG_AHT <= targets.AHT_TARGET THEN 'Meeting Target' WHEN kpis.AVG_AHT <= targets.AHT_TARGET * 1.1 THEN 'Close to Target' ELSE 'Below Target' END AS AHT_PERFORMANCE, CASE WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET THEN 'Meeting Target' WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET * 0.9 THEN 'Close to Target' ELSE 'Below Target' END AS CSAT_PERFORMANCE, -- Channel breakdown kpis.CHAT_COUNT, kpis.EMAIL_COUNT, kpis.VOICE_COUNT, -- Client-specific kpis.IVR_ENTRY_COUNT, kpis.AVG_PAYMENT_AMOUNT, kpis.TOTAL_PAYMENT_AMOUNT FROM agent_kpis kpis LEFT JOIN current_roster roster ON kpis.AGENT_ID = roster.AGENT_ID LEFT JOIN targets ON kpis.KPI_DATE = targets.TARGET_DATE class="p">) class="k">SELECT * FROM dashboard

Benefits: - ✅ No storage cost (views are just queries) - ✅ Always reflects latest roster/target data - ✅ Tableau-optimized (denormalized, pre-calculated) - ✅ Easy to modify without rebuilding data

Complete Data Flow¶

┌─────────────────────────────────────────────────────────────────┐
│ Step 1: Copy to Snowflake                                       │
│ xo-foundry → BRONZE.GLADLY_CONTACT_TIMESTAMPS                   │
│   - Raw data (all VARCHAR)                                      │
│   - Truncated daily (idempotent)                                │
│   - Metadata added (RECORD_KEY, RECORD_HASH, etc.)              │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 2: Merge into historical table with record metadata        │
│ dbt → SILVER.CONTACT_TIMESTAMPS                                 │
│   - Append-only (INSERT WHERE NOT EXISTS on RECORD_KEY)         │
│   - Type conversion (VARCHAR → proper types)                    │
│   - Preserve metadata from Bronze                               │
│   - NO enrichment, NO filtering (pure preservation)             │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Steps 3 & 4: Add XO employee + client metadata                  │
│ dbt → GOLD.CONTACTS (enriched core entity)                      │
│   - JOIN with CORE.ROSTER (employee metadata)                   │
│   - JOIN with CORE.INBOX_GLOSSARY (client metadata)             │
│   - JOIN with CORE.QUEUE_GLOSSARY (client metadata)             │
│   - Filter to XO agents only (WHERE IS_XO_AGENT = TRUE)         │
│   - Latest state or full history                                │
│   ← THIS IS THE SOURCE OF TRUTH FOR ANALYSTS                    │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 5: Create SOT KPI table                                    │
│ dbt → GOLD.AGENT_KPIS_DAILY                                     │
│   - Aggregate from GOLD.CONTACTS                                │
│   - Standard KPI schema (all clients share)                     │
│   - Client-specific KPIs (additional columns)                   │
│   - Pre-aggregated for performance                              │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 6: Create views for consumption                            │
│ dbt → GOLD.AGENT_DASHBOARD (view)                               │
│      GOLD.AGENT_AD_HOC (view)                                   │
│   - Tableau-specific views                                      │
│   - Ad-hoc analyst views                                        │
│   - No storage cost (just queries)                              │
└─────────────────────────────────────────────────────────────────┘

Performance Considerations¶

Why Denormalize in Gold?¶

Storage is cheap, compute is expensive in Snowflake:

Option A: Join at query time (BAD)

-- Every Tableau query runs this JOIN
SELECT
    contacts.*,
    roster.*,
    glossary.*
FROM SILVER.CONTACTS contacts
JOIN CORE.ROSTER roster ON ...
JOIN CORE.GLOSSARY glossary ON ...
WHERE ...

- ❌ Expensive joins on every query - ❌ Slow dashboard load times - ❌ High compute cost

Option B: Denormalize once in Gold (GOOD)

-- Gold table already has everything joined
SELECT * FROM GOLD.CONTACTS
WHERE ...

- ✅ Fast queries (no joins) - ✅ Low compute cost - ✅ Better user experience

Trade-off: More storage, but storage is ~10x cheaper than compute in Snowflake.

Incremental vs Full Refresh¶

Silver (historical preservation): - Use incremental with append strategy - Only process new records each run - Fast and efficient

Gold enriched (core entities): - Option A: table materialized, rebuild daily - Simpler logic - Always fresh with latest rosters/glossaries - Good for tables under 100M rows

Option B: incremental with merge logic
More complex
Faster for very large tables
Need to handle roster/glossary changes

Recommendation: Start with Option A (table), move to Option B if performance becomes an issue.

Gold metrics (aggregated): - Use table, rebuild daily from Gold enriched - Pre-aggregated = fast for dashboards - Can switch to incremental for very large time ranges

Naming Convention (Final Recommendation)¶

Drop ALL Prefixes¶

Principle: Schema name provides context, don't repeat it

-- ❌ OLD (redundant)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
SILVER.WH_CONVERSATIONS
GOLD.DM_SCORECARD_GPP

-- ✅ NEW (clean)
SILVER.CONTACT_TIMESTAMPS
SILVER.CONVERSATIONS
GOLD.AGENT_KPIS_DAILY

Full Naming Convention:¶

Bronze: {SOURCE}_{OBJECT}

BRONZE.GLADLY_CONTACT_TIMESTAMPS
BRONZE.GLADLY_CONVERSATION_TIMESTAMPS
BRONZE.SPROUT_MESSAGES

Silver: {OBJECT} (optionally {OBJECT}_HISTORY)

SILVER.CONTACT_TIMESTAMPS
SILVER.CONVERSATION_TIMESTAMPS
SILVER.MESSAGES

Gold Core: {OBJECT} (singular for dimension-like, plural for fact-like)

GOLD.CONTACTS              -- Enriched contact facts
GOLD.CONVERSATIONS         -- Enriched conversation facts
GOLD.AGENTS                -- Current agent dimension

Gold Metrics: {METRIC}_{GRAIN}

GOLD.AGENT_KPIS_DAILY
GOLD.CHANNEL_METRICS_HOURLY
GOLD.QUEUE_PERFORMANCE_DAILY
GOLD.EXECUTIVE_SUMMARY_WEEKLY

Gold Views: {PURPOSE}

GOLD.AGENT_DASHBOARD
GOLD.AGENT_AD_HOC
GOLD.EXECUTIVE_SUMMARY
GOLD.MANAGER_SCORECARD

Supporting Infrastructure¶

Employee Rosters¶

Location: CORE_DB (shared across clients)

Recommendation: Keep separate per client initially

CORE_DB.BRONZE.BAMBOOHR_WARBYPARKER_RAW
CORE_DB.SILVER.ROSTER_WARBYPARKER           -- Current roster
CORE_DB.SILVER.ROSTER_WARBYPARKER_HISTORY   -- SCD Type 2

CORE_DB.BRONZE.BAMBOOHR_CONDENAST_RAW
CORE_DB.SILVER.ROSTER_CONDENAST
CORE_DB.SILVER.ROSTER_CONDENAST_HISTORY

Future consolidation (when all clients standardized):

CORE_DB.SILVER.ROSTER_ALL
  - Add CLIENT_NAME column
  - Unified schema

Glossary Tables¶

Location: CORE_DB (shared reference data)

Naming: {CLIENT}_{TOPIC}_GLOSSARY

CORE_DB.SILVER.WARBYPARKER_INBOX_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_QUEUE_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_SCORECARD_GLOSSARY

CORE_DB.SILVER.CONDENAST_DISPOSITION_GLOSSARY
CORE_DB.SILVER.CONDENAST_TOPIC_GLOSSARY

Why CORE_DB? These are shared reference data, not client-specific facts.

Migration Path¶

Phase 1: Fix Silver (Remove Enrichment)¶

Current: Silver has employee + glossary joins Target: Silver is pure historical preservation

-- BEFORE (mixed concerns)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
  - Has employee JOINs
  - Has glossary JOINs
  - Has filtering

-- AFTER (single concern)
SILVER.CONTACT_TIMESTAMPS
  - No JOINs
  - No filtering
  - Just type conversion + deduplication

Phase 2: Create Gold Core (Enriched Entities)¶

New tables:

GOLD.CONTACTS              -- Enriched with all metadata
GOLD.CONVERSATIONS         -- Enriched with all metadata

These become the SOT for analysts.

Phase 3: Create Gold Metrics (KPIs)¶

New tables:

GOLD.AGENT_KPIS_DAILY
GOLD.CHANNEL_METRICS_HOURLY
GOLD.QUEUE_PERFORMANCE_DAILY

Define standard schema across all clients.

Phase 4: Create Gold Views (Consumption)¶

New views:

GOLD.AGENT_DASHBOARD       -- Tableau
GOLD.AGENT_AD_HOC          -- Analysts
GOLD.EXECUTIVE_SUMMARY     -- Executives

Phase 5: Deprecate Old Tables¶

Remove old OPERATIONS.AGGREGATE.* tables
Update Tableau connections to point to new Gold views

Summary of Recommendations¶

✅ Layer Responsibilities¶

Layer	Purpose	Enrichment?	Aggregation?	Filtering?
Bronze	Raw landing	❌ No	❌ No	❌ No
Silver	Historical preservation	❌ No	❌ No	❌ No
Gold Core	Enriched entities	✅ YES	❌ No	✅ YES
Gold Metrics	Aggregated KPIs	N/A	✅ YES	✅ YES
Gold Views	Consumption	N/A	Optional	Optional

✅ Naming Convention¶

Drop ALL prefixes (SILVER_, WH_, DM_)
Schema name provides layer context
Use descriptive table names

✅ Performance Strategy¶

Denormalize in Gold (storage is cheap)
Pre-aggregate common patterns (metrics layer)
Use views for flexible consumption (no storage cost)

✅ Separation of Concerns¶

Silver = Immutable history (never changes after written)
Gold = Business logic (rebuild as rosters/glossaries change)
Clear boundary between layers

Next Actions¶

Questions to answer:

Should we rename existing Silver tables to drop SILVER_ prefix?
When should we implement Gold enriched layer (GOLD.CONTACTS)?
What standard KPIs do ALL clients need? (define schema)
Should rosters stay separate or consolidate to ROSTER_ALL?
Where are glossaries sourced from (Google Sheets? Manual entry)?

Implementation priority:

✅ Bronze → Already correct
⚠️ Silver → Remove enrichment, keep simple
🎯 Gold Core → Add enrichment here (employee + glossary)
🎯 Gold Metrics → Create standard KPI schema
🎯 Gold Views → Tableau + analyst views

This architecture follows Snowflake + dbt best practices and optimizes for: - ✅ Performance (denormalize early, pre-aggregate) - ✅ Maintainability (clear separation of concerns) - ✅ Cost efficiency (storage cheap, compute expensive) - ✅ Analyst experience (wide tables, no joins needed)