ELT Layer Architecture - Best Practices & Recommendations¶
Date: 2026-01-05 Status: Recommendation Purpose: Define optimal ELT layer architecture based on Snowflake + dbt best practices
Analysis: Required Steps & Optimal Placement¶
Your Required Pipeline Steps:¶
- ✅ Copy to Snowflake
- ✅ Merge into historical table with record metadata
- ✅ Add XO employee metadata
- ✅ Add client-specific metadata (glossaries)
- ✅ Create SOT (Source of Truth) KPI table
- ✅ Create multiple views for consumption (analysts + Tableau)
❌ Problems with Current Approach¶
Issue 1: Redundant Table Prefixes¶
Current:
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS ❌ Redundant prefix
GOLD.DM_SCORECARD_GPP ❌ Unnecessary DM_ prefix
Better:
SILVER.CONTACT_TIMESTAMPS ✅ Schema name provides context
GOLD.AGENT_KPIS_DAILY ✅ Descriptive, no prefix needed
Principle: Schema name already indicates layer - don't repeat it in table name.
Issue 2: Mixing Concerns in Silver¶
Current plan (from earlier discussion): - Silver does: Historical preservation + Employee enrichment + Glossary enrichment + Filtering
Problem: This mixes multiple concerns in one layer - Historical preservation (append records) - Business enrichment (JOINs with rosters/glossaries) - Filtering logic (XO agents only)
Result: - Complex Silver models - Expensive joins on every incremental run - Hard to maintain
Issue 3: Unclear "Source of Truth"¶
What is the SOT for analysts? - Silver historical table? (has all records but no enrichment) - Gold KPI table? (aggregated, lost granularity) - Something in between?
Best practice: SOT should be the most granular enriched table - all records with all metadata.
✅ Recommended Architecture (Best Practices)¶
Layer Philosophy¶
Bronze: Immutable raw data (source of record) Silver: Historical preservation (source of history) Gold: Enriched + aggregated data (source of truth for business)
Key Principle: Each layer has a single clear purpose.
Optimal Layer Design¶
LAYER 1: BRONZE (Raw Landing)¶
Purpose: Immutable raw source data exactly as received
Who creates it: xo-foundry (Airflow)
Characteristics: - ✅ Truncated daily (idempotent loads) - ✅ All VARCHAR (no type conversion) - ✅ Metadata added: RECORD_KEY, RECORD_HASH, DATE_TO_WAREHOUSE - ✅ NO transformations - ✅ NO filtering - ✅ Temporary landing zone (doesn't preserve history)
Naming convention: {SOURCE}_{OBJECT}
Example:
WBP_DB.BRONZE.GLADLY_CONTACT_TIMESTAMPS
WBP_DB.BRONZE.GLADLY_CONVERSATION_TIMESTAMPS
WBP_DB.BRONZE.SPROUT_MESSAGES
✅ This is already correct - no changes needed.
LAYER 2: SILVER (Historical Preservation)¶
Purpose: Immutable historical record of all source data
Who creates it: dbt
What it does: - ✅ Preserve full history (append-only or SCD Type 2) - ✅ Type conversion (VARCHAR → proper types) - ✅ Column standardization (uppercase, underscores) - ✅ Deduplication (RECORD_KEY) - ⚠️ NO enrichment (no JOINs with other tables) - ⚠️ NO filtering (preserve all records) - ⚠️ NO aggregation (keep grain identical to Bronze)
Why no enrichment? - Performance: Joins are expensive on large fact tables - Maintainability: If roster/glossary structure changes, don't reprocess history - Separation of concerns: Silver = preservation, Gold = enrichment
Naming convention: {OBJECT}_HISTORY or just {OBJECT}
Materialization: incremental (append strategy)
Example:
-- Simple, fast, minimal transformation
WBP_DB.SILVER.CONTACT_TIMESTAMPS
- All Bronze columns (properly typed)
- Preserved metadata columns
- DBT audit columns
- NO employee metadata (that's Gold's job)
- NO glossary metadata (that's Gold's job)
- NO filtering (preserve everything)
Silver Model Pattern:
-- models/silver/warbyparker/contact_timestamps.sql
WITH bronze_source AS (
SELECT * FROM {{ source('bronze_warbyparker', 'GLADLY_CONTACT_TIMESTAMPS') }}
),
typed_and_cleaned AS (
SELECT
-- Type conversions
CAST(CONTACT_ID AS VARCHAR) AS CONTACT_ID,
CAST(CONVERSATION_ID AS VARCHAR) AS CONVERSATION_ID,
TO_TIMESTAMP_NTZ(TIMESTAMP) AS TIMESTAMP,
-- All other columns (type-converted)
AGENT_NAME,
ASSIGNED_INBOX_ID,
ASSIGNED_INBOX_NAME,
CHANNEL,
-- ... all Bronze columns
-- Preserve metadata
RECORD_KEY,
RECORD_HASH,
DATE_TO_WAREHOUSE,
SOURCE_FILE,
BATCH_ID,
PIPELINE_RUN_ID,
-- Add dbt audit
CURRENT_TIMESTAMP() AS DBT_LOADED_AT,
'{{ invocation_id }}' AS DBT_INVOCATION_ID
FROM bronze_source
)
{% if is_incremental() %}
-- Only append new records
, new_records AS (
SELECT typed.*
FROM typed_and_cleaned typed
LEFT JOIN {{ this }} silver
ON typed.RECORD_KEY = silver.RECORD_KEY
WHERE silver.RECORD_KEY IS NULL
)
SELECT * FROM new_records
{% else %}
SELECT * FROM typed_and_cleaned
{% endif %}
Benefits: - ✅ Fast incremental runs (no expensive joins) - ✅ Immutable history (never changes) - ✅ Simple to maintain - ✅ Can reprocess Gold without touching Silver
LAYER 3A: GOLD - Core Entities (Enriched)¶
Purpose: Denormalized business entities with all context (SOT for analysts)
Who creates it: dbt
What it does: - ✅ JOIN with employee rosters - ✅ JOIN with client glossaries - ✅ Filter to XO agents only - ✅ Latest state or full history (depending on use case) - ✅ Wide denormalized tables (everything analysts need)
Why denormalize here? - Storage is cheap, compute is expensive - Rosters and glossaries are relatively static - Analysts query this layer - make it fast - Pre-joined is better than join-at-query-time
Naming convention: {OBJECT} or {OBJECT}_ENRICHED
Materialization:
- table (if keeping latest only)
- incremental (if keeping full history with enrichment)
Example Tables:
WBP_DB.GOLD.CONTACTS
- All Silver columns
- + XO employee metadata (AGENT_ID, EMAIL, ROLE, SITE, LEADER)
- + Client glossary metadata (INBOX_CATEGORY, QUEUE_TYPE)
- Filtered to XO agents only
- Latest state or full history
WBP_DB.GOLD.CONVERSATIONS
- All Silver columns
- + XO employee metadata
- + Client glossary metadata
- Filtered to XO agents only
Gold Core Model Pattern (Latest State):
-- models/gold/warbyparker/contacts.sql
{{
config(
materialized='table', -- Rebuilt daily with latest
schema='gold'
)
}}
WITH silver_history AS (
SELECT * FROM {{ ref('contact_timestamps') }}
),
-- Latest record per contact (if needed)
latest_contacts AS (
SELECT *
FROM silver_history
QUALIFY ROW_NUMBER() OVER (
PARTITION BY CONTACT_ID
ORDER BY TIMESTAMP DESC
) = 1
),
-- Employee roster (from CORE_DB)
employee_roster AS (
SELECT
GLADLY_NAME,
EID AS AGENT_ID,
FULL_NAME AS AGENT_FULL_NAME,
XO_EMAIL AS AGENT_EMAIL,
ACCOUNT_EMAIL AS AGENT_ACCOUNT_EMAIL,
ROLE AS AGENT_ROLE,
SITE AS AGENT_SITE,
DIRECT_LEADER AS AGENT_LEADER,
STATUS AS AGENT_STATUS
FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }}
WHERE STATUS = 'Active' -- or include terminated based on needs
),
-- Client glossaries
inbox_glossary AS (
SELECT
INBOX_ID,
INBOX_NAME,
INBOX_CATEGORY,
INBOX_TYPE
FROM {{ source('core_glossaries', 'WARBYPARKER_INBOX_GLOSSARY') }}
),
queue_glossary AS (
SELECT
QUEUE_ID,
QUEUE_NAME,
QUEUE_CATEGORY,
QUEUE_TYPE
FROM {{ source('core_glossaries', 'WARBYPARKER_QUEUE_GLOSSARY') }}
),
-- Enriched contacts (SOT for analysts)
enriched AS (
SELECT
-- Original Silver columns
contacts.*,
-- Employee metadata (XO agents)
roster.AGENT_ID,
roster.AGENT_FULL_NAME,
roster.AGENT_EMAIL,
roster.AGENT_ROLE,
roster.AGENT_SITE,
roster.AGENT_LEADER,
roster.AGENT_STATUS,
-- Glossary metadata
inbox.INBOX_CATEGORY,
inbox.INBOX_TYPE,
queue.QUEUE_CATEGORY,
queue.QUEUE_TYPE,
-- Derived flags
CASE
WHEN roster.AGENT_EMAIL LIKE 'xo.%@%' THEN TRUE
ELSE FALSE
END AS IS_XO_AGENT,
-- dbt metadata
CURRENT_TIMESTAMP() AS DBT_UPDATED_AT
FROM latest_contacts contacts
LEFT JOIN employee_roster roster
ON contacts.ASSIGNED_AGENT_NAME = roster.GLADLY_NAME
LEFT JOIN inbox_glossary inbox
ON contacts.ASSIGNED_INBOX_ID = inbox.INBOX_ID
LEFT JOIN queue_glossary queue
ON contacts.QUEUE_ID = queue.QUEUE_ID -- adjust based on actual schema
)
-- Filter to XO agents only
SELECT *
FROM enriched
WHERE IS_XO_AGENT = TRUE
Benefits: - ✅ Analysts have everything in one table - ✅ No joins needed at query time - ✅ Fast queries - ✅ Latest state is clear - ✅ Can rebuild daily without touching Silver
LAYER 3B: GOLD - Metrics (Aggregated KPIs)¶
Purpose: Pre-aggregated business metrics for fast dashboard queries
Who creates it: dbt
What it does: - ✅ Aggregate from Gold enriched tables - ✅ Common grains: daily, hourly, by agent, by channel - ✅ Standardized KPI schema across clients - ✅ Client-specific metrics as additional columns
Naming convention: {METRIC}_{GRAIN}
Materialization: table or incremental
Example Tables:
WBP_DB.GOLD.AGENT_KPIS_DAILY
- Aggregated by: KPI_DATE, AGENT_ID
- Standard KPIs: AVG_HANDLE_TIME, CONTACT_COUNT, AVG_CSAT, etc.
- Client-specific KPIs: CHAT_COUNT, IVR_ENTRY_COUNT, etc.
WBP_DB.GOLD.CHANNEL_METRICS_HOURLY
- Aggregated by: HOUR, CHANNEL
- Metrics: VOLUME, AVG_AHT, etc.
WBP_DB.GOLD.QUEUE_PERFORMANCE_DAILY
- Aggregated by: DATE, QUEUE_ID
- Metrics: SLA%, AVG_WAIT_TIME, etc.
Gold Metrics Model Pattern:
-- models/gold/warbyparker/agent_kpis_daily.sql
{{
config(
materialized='table', -- Rebuild daily
schema='gold'
)
}}
WITH enriched_contacts AS (
SELECT * FROM {{ ref('contacts') }}
),
daily_agent_metrics AS (
SELECT
-- Dimensions
DATE(TIMESTAMP) AS KPI_DATE,
AGENT_ID,
AGENT_EMAIL,
AGENT_FULL_NAME,
AGENT_ROLE,
AGENT_SITE,
-- ==========================================
-- STANDARD KPIs (ALL CLIENTS MUST HAVE)
-- ==========================================
-- Volume metrics
COUNT(DISTINCT CONTACT_ID) AS CONTACT_COUNT,
COUNT(DISTINCT CONVERSATION_ID) AS CONVERSATION_COUNT,
COUNT(DISTINCT CUSTOMER_ID) AS UNIQUE_CUSTOMER_COUNT,
-- Handle time metrics
AVG(HANDLE_TIME_SECONDS) AS AVG_HANDLE_TIME,
SUM(HANDLE_TIME_SECONDS) AS TOTAL_HANDLE_TIME,
MIN(HANDLE_TIME_SECONDS) AS MIN_HANDLE_TIME,
MAX(HANDLE_TIME_SECONDS) AS MAX_HANDLE_TIME,
-- Hold metrics
AVG(HOLD_TIME_SECONDS) AS AVG_HOLD_TIME,
SUM(HOLD_TIME_SECONDS) AS TOTAL_HOLD_TIME,
COUNT(CASE WHEN HOLD_TIME_SECONDS > 0 THEN 1 END) AS CONTACTS_WITH_HOLD,
-- AHT
AVG(AHT_SECONDS) AS AVG_AHT,
-- Quality scores (if available)
AVG(CSAT_SCORE) AS AVG_CSAT,
AVG(QA_SCORE) AS AVG_QA_SCORE,
-- Response time
AVG(FIRST_RESPONSE_TIME_SECONDS) AS AVG_FIRST_RESPONSE_TIME,
-- ==========================================
-- CLIENT-SPECIFIC KPIs (Warby Parker)
-- ==========================================
-- Channel breakdown
COUNT(CASE WHEN CHANNEL = 'CHAT' THEN 1 END) AS CHAT_COUNT,
COUNT(CASE WHEN CHANNEL = 'EMAIL' THEN 1 END) AS EMAIL_COUNT,
COUNT(CASE WHEN CHANNEL = 'VOICE' THEN 1 END) AS VOICE_COUNT,
COUNT(CASE WHEN CHANNEL = 'SMS' THEN 1 END) AS SMS_COUNT,
-- Entry points
COUNT(CASE WHEN ENTRY_POINT = 'IVR' THEN 1 END) AS IVR_ENTRY_COUNT,
COUNT(CASE WHEN ENTRY_POINT = 'WEB' THEN 1 END) AS WEB_ENTRY_COUNT,
-- Payment metrics
AVG(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS AVG_PAYMENT_AMOUNT,
SUM(CASE WHEN PAYMENT_AMOUNT IS NOT NULL THEN PAYMENT_AMOUNT END) AS TOTAL_PAYMENT_AMOUNT,
COUNT(CASE WHEN PAYMENT_AMOUNT > 0 THEN 1 END) AS TRANSACTIONS_COUNT,
-- Inbox categories (from glossary enrichment)
COUNT(CASE WHEN INBOX_CATEGORY = 'SALES' THEN 1 END) AS SALES_CONTACTS,
COUNT(CASE WHEN INBOX_CATEGORY = 'SUPPORT' THEN 1 END) AS SUPPORT_CONTACTS,
COUNT(CASE WHEN INBOX_CATEGORY = 'RETURNS' THEN 1 END) AS RETURNS_CONTACTS,
-- Metadata
CURRENT_TIMESTAMP() AS DBT_UPDATED_AT
FROM enriched_contacts
GROUP BY 1, 2, 3, 4, 5, 6
)
SELECT * FROM daily_agent_metrics
Standard KPI Schema (for ALL clients):
-- Define this schema once, enforce across all clients
STANDARD_COLUMNS (required for every client):
- KPI_DATE
- AGENT_ID, AGENT_EMAIL, AGENT_FULL_NAME
- AGENT_ROLE, AGENT_SITE
- CONTACT_COUNT
- CONVERSATION_COUNT
- UNIQUE_CUSTOMER_COUNT
- AVG_HANDLE_TIME, TOTAL_HANDLE_TIME
- AVG_HOLD_TIME, TOTAL_HOLD_TIME
- AVG_AHT
- AVG_CSAT, AVG_QA_SCORE
- AVG_FIRST_RESPONSE_TIME
CLIENT_SPECIFIC_COLUMNS (vary by client):
- Warby Parker: CHAT_COUNT, IVR_ENTRY_COUNT, AVG_PAYMENT_AMOUNT
- Condé Nast: TOPIC_COUNT, ARTICLE_LINKS_SENT
- Keller Postman: CASE_TYPE_COUNT, LEGAL_CONSULTATIONS
LAYER 3C: GOLD - Views (Consumption)¶
Purpose: Specific views for different consumption patterns
Who creates it: dbt
What it does: - ✅ Tableau-specific views (denormalized, optimized for dashboards) - ✅ Ad-hoc analyst views (with helpful calculations pre-computed) - ✅ Executive summary views (high-level aggregations)
Naming convention: {PURPOSE}
Materialization: view (not table - no storage cost)
Example Views:
WBP_DB.GOLD.AGENT_DASHBOARD -- Tableau dashboard view
WBP_DB.GOLD.AGENT_AD_HOC -- Analyst exploration view
WBP_DB.GOLD.EXECUTIVE_SUMMARY -- Executive rollup view
WBP_DB.GOLD.AGENT_PERFORMANCE -- Manager performance view
Gold View Pattern (Tableau Dashboard):
-- models/gold/warbyparker/agent_dashboard.sql
{{
config(
materialized='view', -- Not a table, just a view
schema='gold'
)
}}
WITH agent_kpis AS (
SELECT * FROM {{ ref('agent_kpis_daily') }}
),
-- Current roster (for latest employee info)
current_roster AS (
SELECT
EID AS AGENT_ID,
FULL_NAME,
ROLE,
SITE,
DIRECT_LEADER,
STATUS,
HIRE_DATE,
START_DATE_IN_ACCOUNT
FROM {{ source('core_rosters', 'ROSTER_WARBYPARKER') }}
),
-- Performance targets (if available)
targets AS (
SELECT
TARGET_DATE,
AHT_TARGET,
CSAT_TARGET,
QA_TARGET,
HANDLE_TIME_TARGET
FROM {{ source('core_targets', 'WARBYPARKER_TARGETS') }}
),
dashboard AS (
SELECT
-- KPI data
kpis.KPI_DATE,
kpis.AGENT_ID,
kpis.AGENT_EMAIL,
kpis.AGENT_FULL_NAME,
-- Current roster info (may differ from historical)
roster.ROLE AS CURRENT_ROLE,
roster.SITE AS CURRENT_SITE,
roster.DIRECT_LEADER,
roster.STATUS AS CURRENT_STATUS,
roster.HIRE_DATE,
DATEDIFF(day, roster.START_DATE_IN_ACCOUNT, kpis.KPI_DATE) AS DAYS_IN_ACCOUNT,
-- Volume metrics
kpis.CONTACT_COUNT,
kpis.CONVERSATION_COUNT,
kpis.UNIQUE_CUSTOMER_COUNT,
-- Performance metrics
kpis.AVG_HANDLE_TIME,
kpis.AVG_AHT,
kpis.AVG_CSAT,
kpis.AVG_QA_SCORE,
-- Targets
targets.AHT_TARGET,
targets.CSAT_TARGET,
targets.QA_TARGET,
-- Variance calculations
kpis.AVG_AHT - targets.AHT_TARGET AS AHT_VARIANCE,
kpis.AVG_CSAT - targets.CSAT_TARGET AS CSAT_VARIANCE,
kpis.AVG_QA_SCORE - targets.QA_TARGET AS QA_VARIANCE,
-- Performance flags (for conditional formatting in Tableau)
CASE
WHEN kpis.AVG_AHT <= targets.AHT_TARGET THEN 'Meeting Target'
WHEN kpis.AVG_AHT <= targets.AHT_TARGET * 1.1 THEN 'Close to Target'
ELSE 'Below Target'
END AS AHT_PERFORMANCE,
CASE
WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET THEN 'Meeting Target'
WHEN kpis.AVG_CSAT >= targets.CSAT_TARGET * 0.9 THEN 'Close to Target'
ELSE 'Below Target'
END AS CSAT_PERFORMANCE,
-- Channel breakdown
kpis.CHAT_COUNT,
kpis.EMAIL_COUNT,
kpis.VOICE_COUNT,
-- Client-specific
kpis.IVR_ENTRY_COUNT,
kpis.AVG_PAYMENT_AMOUNT,
kpis.TOTAL_PAYMENT_AMOUNT
FROM agent_kpis kpis
LEFT JOIN current_roster roster
ON kpis.AGENT_ID = roster.AGENT_ID
LEFT JOIN targets
ON kpis.KPI_DATE = targets.TARGET_DATE
)
SELECT * FROM dashboard
Benefits: - ✅ No storage cost (views are just queries) - ✅ Always reflects latest roster/target data - ✅ Tableau-optimized (denormalized, pre-calculated) - ✅ Easy to modify without rebuilding data
Complete Data Flow¶
┌─────────────────────────────────────────────────────────────────┐
│ Step 1: Copy to Snowflake │
│ xo-foundry → BRONZE.GLADLY_CONTACT_TIMESTAMPS │
│ - Raw data (all VARCHAR) │
│ - Truncated daily (idempotent) │
│ - Metadata added (RECORD_KEY, RECORD_HASH, etc.) │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 2: Merge into historical table with record metadata │
│ dbt → SILVER.CONTACT_TIMESTAMPS │
│ - Append-only (INSERT WHERE NOT EXISTS on RECORD_KEY) │
│ - Type conversion (VARCHAR → proper types) │
│ - Preserve metadata from Bronze │
│ - NO enrichment, NO filtering (pure preservation) │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Steps 3 & 4: Add XO employee + client metadata │
│ dbt → GOLD.CONTACTS (enriched core entity) │
│ - JOIN with CORE.ROSTER (employee metadata) │
│ - JOIN with CORE.INBOX_GLOSSARY (client metadata) │
│ - JOIN with CORE.QUEUE_GLOSSARY (client metadata) │
│ - Filter to XO agents only (WHERE IS_XO_AGENT = TRUE) │
│ - Latest state or full history │
│ ← THIS IS THE SOURCE OF TRUTH FOR ANALYSTS │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 5: Create SOT KPI table │
│ dbt → GOLD.AGENT_KPIS_DAILY │
│ - Aggregate from GOLD.CONTACTS │
│ - Standard KPI schema (all clients share) │
│ - Client-specific KPIs (additional columns) │
│ - Pre-aggregated for performance │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Step 6: Create views for consumption │
│ dbt → GOLD.AGENT_DASHBOARD (view) │
│ GOLD.AGENT_AD_HOC (view) │
│ - Tableau-specific views │
│ - Ad-hoc analyst views │
│ - No storage cost (just queries) │
└─────────────────────────────────────────────────────────────────┘
Performance Considerations¶
Why Denormalize in Gold?¶
Storage is cheap, compute is expensive in Snowflake:
Option A: Join at query time (BAD)
-- Every Tableau query runs this JOIN
SELECT
contacts.*,
roster.*,
glossary.*
FROM SILVER.CONTACTS contacts
JOIN CORE.ROSTER roster ON ...
JOIN CORE.GLOSSARY glossary ON ...
WHERE ...
Option B: Denormalize once in Gold (GOOD)
- ✅ Fast queries (no joins) - ✅ Low compute cost - ✅ Better user experienceTrade-off: More storage, but storage is ~10x cheaper than compute in Snowflake.
Incremental vs Full Refresh¶
Silver (historical preservation):
- Use incremental with append strategy
- Only process new records each run
- Fast and efficient
Gold enriched (core entities):
- Option A: table materialized, rebuild daily
- Simpler logic
- Always fresh with latest rosters/glossaries
- Good for tables under 100M rows
- Option B:
incrementalwith merge logic - More complex
- Faster for very large tables
- Need to handle roster/glossary changes
Recommendation: Start with Option A (table), move to Option B if performance becomes an issue.
Gold metrics (aggregated):
- Use table, rebuild daily from Gold enriched
- Pre-aggregated = fast for dashboards
- Can switch to incremental for very large time ranges
Naming Convention (Final Recommendation)¶
Drop ALL Prefixes¶
Principle: Schema name provides context, don't repeat it
-- ❌ OLD (redundant)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
SILVER.WH_CONVERSATIONS
GOLD.DM_SCORECARD_GPP
-- ✅ NEW (clean)
SILVER.CONTACT_TIMESTAMPS
SILVER.CONVERSATIONS
GOLD.AGENT_KPIS_DAILY
Full Naming Convention:¶
Bronze: {SOURCE}_{OBJECT}
Silver: {OBJECT} (optionally {OBJECT}_HISTORY)
Gold Core: {OBJECT} (singular for dimension-like, plural for fact-like)
GOLD.CONTACTS -- Enriched contact facts
GOLD.CONVERSATIONS -- Enriched conversation facts
GOLD.AGENTS -- Current agent dimension
Gold Metrics: {METRIC}_{GRAIN}
GOLD.AGENT_KPIS_DAILY
GOLD.CHANNEL_METRICS_HOURLY
GOLD.QUEUE_PERFORMANCE_DAILY
GOLD.EXECUTIVE_SUMMARY_WEEKLY
Gold Views: {PURPOSE}
Supporting Infrastructure¶
Employee Rosters¶
Location: CORE_DB (shared across clients)
Recommendation: Keep separate per client initially
CORE_DB.BRONZE.BAMBOOHR_WARBYPARKER_RAW
CORE_DB.SILVER.ROSTER_WARBYPARKER -- Current roster
CORE_DB.SILVER.ROSTER_WARBYPARKER_HISTORY -- SCD Type 2
CORE_DB.BRONZE.BAMBOOHR_CONDENAST_RAW
CORE_DB.SILVER.ROSTER_CONDENAST
CORE_DB.SILVER.ROSTER_CONDENAST_HISTORY
Future consolidation (when all clients standardized):
Glossary Tables¶
Location: CORE_DB (shared reference data)
Naming: {CLIENT}_{TOPIC}_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_INBOX_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_QUEUE_GLOSSARY
CORE_DB.SILVER.WARBYPARKER_SCORECARD_GLOSSARY
CORE_DB.SILVER.CONDENAST_DISPOSITION_GLOSSARY
CORE_DB.SILVER.CONDENAST_TOPIC_GLOSSARY
Why CORE_DB? These are shared reference data, not client-specific facts.
Migration Path¶
Phase 1: Fix Silver (Remove Enrichment)¶
Current: Silver has employee + glossary joins Target: Silver is pure historical preservation
-- BEFORE (mixed concerns)
SILVER.SILVER_WARBYPARKER_CONTACT_TIMESTAMPS
- Has employee JOINs
- Has glossary JOINs
- Has filtering
-- AFTER (single concern)
SILVER.CONTACT_TIMESTAMPS
- No JOINs
- No filtering
- Just type conversion + deduplication
Phase 2: Create Gold Core (Enriched Entities)¶
New tables:
These become the SOT for analysts.
Phase 3: Create Gold Metrics (KPIs)¶
New tables:
Define standard schema across all clients.
Phase 4: Create Gold Views (Consumption)¶
New views:
Phase 5: Deprecate Old Tables¶
- Remove old
OPERATIONS.AGGREGATE.*tables - Update Tableau connections to point to new Gold views
Summary of Recommendations¶
✅ Layer Responsibilities¶
| Layer | Purpose | Enrichment? | Aggregation? | Filtering? |
|---|---|---|---|---|
| Bronze | Raw landing | ❌ No | ❌ No | ❌ No |
| Silver | Historical preservation | ❌ No | ❌ No | ❌ No |
| Gold Core | Enriched entities | ✅ YES | ❌ No | ✅ YES |
| Gold Metrics | Aggregated KPIs | N/A | ✅ YES | ✅ YES |
| Gold Views | Consumption | N/A | Optional | Optional |
✅ Naming Convention¶
- Drop ALL prefixes (SILVER_, WH_, DM_)
- Schema name provides layer context
- Use descriptive table names
✅ Performance Strategy¶
- Denormalize in Gold (storage is cheap)
- Pre-aggregate common patterns (metrics layer)
- Use views for flexible consumption (no storage cost)
✅ Separation of Concerns¶
- Silver = Immutable history (never changes after written)
- Gold = Business logic (rebuild as rosters/glossaries change)
- Clear boundary between layers
Next Actions¶
Questions to answer:
- Should we rename existing Silver tables to drop
SILVER_prefix? - When should we implement Gold enriched layer (GOLD.CONTACTS)?
- What standard KPIs do ALL clients need? (define schema)
- Should rosters stay separate or consolidate to ROSTER_ALL?
- Where are glossaries sourced from (Google Sheets? Manual entry)?
Implementation priority:
- ✅ Bronze → Already correct
- ⚠️ Silver → Remove enrichment, keep simple
- 🎯 Gold Core → Add enrichment here (employee + glossary)
- 🎯 Gold Metrics → Create standard KPI schema
- 🎯 Gold Views → Tableau + analyst views
This architecture follows Snowflake + dbt best practices and optimizes for: - ✅ Performance (denormalize early, pre-aggregate) - ✅ Maintainability (clear separation of concerns) - ✅ Cost efficiency (storage cheap, compute expensive) - ✅ Analyst experience (wide tables, no joins needed)