Playbook: Expanding the Gold Layer¶
Purpose: Step-by-step guide to add new Gold models (facts, dims, aggregates, or reports) to an existing client. Assumes Silver models for the client already exist.
Architecture reference: All conventions follow ADR 019 — Gold Layer Standardization. Project uses domain-first structure (`models/{client}/gold/`).
Before You Start¶
Confirm:
- [ ] Silver models for this client are deployed and populated
- [ ] CORE_DB.SILVER.ROSTER_{CLIENT} is available (if building facts that need agent enrichment)
- [ ] Client glossary exists if needed (e.g., {CLIENT}_DB.SILVER.INBOX_GLOSSARY)
- [ ] You know the grain of the new model (one row per what?)
- [ ] Check dbt-model-registry.md — does this model already exist?
Decision: Which Gold Type?¶
| You want to... | Build a... | Prefix |
|---|---|---|
| Enrich raw events with roster/glossary data | Fact | fct_ |
| Track current state of an entity (agent, contact) | Dimension | dim_ |
| Pre-aggregate a metric for reuse across reports | Aggregate | agg_ |
| Create a Tableau-ready consumption view | Report | rpt_ |
Reports must source from Aggregates only (never Facts or Silver). See ADR 019.
Phase 1 — Gold Fact Model (fct_)¶
Purpose: Enriched event-level data. Joins Silver events with rosters and glossaries.
Step 1.1 — Create the fact model¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/fct_{client}_{object}.sql
{{ config(
schema='GOLD',
materialized='table'
) }}
-- Database inherited from dbt_project.yml at client folder level (ADR 019)
WITH contacts AS (
SELECT * FROM {{ ref('wbp_contact_timestamps') }} -- Silver model (client-prefixed)
),
roster AS (
SELECT * FROM {{ ref('bi_roster_warbyparker') }} -- Core Silver roster via ref()
)
SELECT
c.CONTACT_ID,
c.CREATED_AT,
c.AGENT_ID,
-- Roster enrichment (LEFT JOIN — keep ALL records)
{{ roster_columns('r') }},
-- XO flag (ADR 019): downstream aggregates filter on this
CASE WHEN r.EID IS NOT NULL THEN TRUE ELSE FALSE END AS IS_XO_AGENT,
-- Metadata
c.RECORD_KEY,
c.DATE_TO_WAREHOUSE
FROM contacts c
LEFT JOIN roster r ON c.AGENT_ID = r.GLADLY_ID -- LEFT JOIN, not INNER
Note on `dynamic_table`: The default materialization for `fct_`, `dim_`, and `agg_` is `table`. Use `dynamic_table` only when Snowflake should own the refresh schedule (e.g., continuous/streaming pipelines, sub-hourly lag requirements). For all daily Airflow-orchestrated Gold models, use `table`.
Rules for facts (ADR 019):
- Source from Silver only (never Bronze)
- LEFT JOIN to roster — facts contain ALL records, not just XO agents
- Add IS_XO_AGENT flag: CASE WHEN r.EID IS NOT NULL THEN TRUE ELSE FALSE END
- Aggregates and reports filter WHERE is_xo_agent = TRUE
- Same grain as the Silver source (event-level)
- No aggregation
- Include RECORD_KEY and DATE_TO_WAREHOUSE for traceability
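The LEFT JOIN plus flag rule can be seen outside SQL. A minimal Python sketch (sample data is invented) showing that every contact survives the join, and an agent missing from the roster simply gets `is_xo_agent = False` rather than being dropped:

```python
# Hypothetical sample data illustrating the LEFT JOIN + IS_XO_AGENT rule.
contacts = [
    {"contact_id": 1, "agent_id": "a1"},
    {"contact_id": 2, "agent_id": "a2"},
    {"contact_id": 3, "agent_id": "zz"},  # agent not on the XO roster
]
roster = {"a1": {"eid": "E100"}, "a2": {"eid": "E200"}}

enriched = [
    {**c,
     "eid": roster.get(c["agent_id"], {}).get("eid"),
     "is_xo_agent": c["agent_id"] in roster}  # flag, don't filter
    for c in contacts
]

assert len(enriched) == len(contacts)  # LEFT JOIN keeps ALL records
assert [r["is_xo_agent"] for r in enriched] == [True, True, False]
```

An INNER JOIN here would have silently dropped contact 3; the flag defers that decision to the aggregate layer.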
Step 1.2 — Add schema.yml entry¶
models:
  - name: fct_wbp_contacts
    description: >
      Enriched contact events from Gladly, joined with XO agent roster.
      Grain: one row per contact (CONTACT_ID). Full dataset with IS_XO_AGENT flag.
    columns:
      - name: contact_id
        description: "Gladly contact ID — primary key"
        tests:
          - unique
          - not_null
      - name: agent_id
        description: "XO agent identifier — foreign key to dim_agent"
        tests:
          - not_null
      - name: created_at
        description: "Contact creation timestamp (UTC)"
        tests:
          - not_null
Phase 2 — Gold Aggregate Model (agg_)¶
Purpose: Pre-aggregated metrics, reusable building blocks for multiple report views.
Naming: agg_{client}_{subject}_{grain} — grain is always the last token.
Step 2.1 — Create the aggregate model¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/agg_{client}_{subject}_{grain}.sql
{{ config(
schema='GOLD',
materialized='table'
) }}
-- Database inherited from dbt_project.yml at client folder level (ADR 019)
WITH facts AS (
SELECT * FROM {{ ref('fct_wbp_contacts') }} -- Source from facts, not Silver
WHERE is_xo_agent = TRUE -- XO filter at aggregate level
)
SELECT
CREATED_AT::DATE AS date,
EID,
AGENT_NAME,
-- Store sums and counts, NEVER averages (ADR 019)
COUNT(DISTINCT CONTACT_ID) AS contact_count,
SUM(handle_time_sec) AS total_handle_time_sec,
SUM(acw_sec) AS total_acw_sec
FROM facts
GROUP BY 1, 2, 3
Rules for aggregates (ADR 019):
- Source from Gold facts only (never Silver, never Bronze, never roster directly)
- Daily grain only — weekly/monthly/quarterly rollups happen in report views via DATE_TRUNC
- Filter WHERE is_xo_agent = TRUE (inherited from fact's IS_XO_AGENT flag)
- Store sums and counts, never averages — downstream computes avg = SUM(x) / SUM(count)
- Include dimension columns needed for slicing (EID, AGENT_NAME, etc.)
- No JOINs — facts already have enrichment
- If an aggregate needs a column not in a fact, add it to the fact
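The "sums and counts, never averages" rule exists because averages do not re-aggregate correctly. A short Python sketch (invented numbers) showing how averaging pre-computed per-row averages diverges from re-deriving the rate from stored sums:

```python
# Two agent-days with different volumes: (contact_count, total_handle_time_sec).
rows = [(10, 2000), (40, 4000)]

# Wrong: averaging pre-computed per-row averages weights both rows equally.
avg_of_avgs = sum(t / c for c, t in rows) / len(rows)  # (200 + 100) / 2

# Right: re-derive the rate from stored sums, as ADR 019 prescribes.
true_aht = sum(t for _, t in rows) / sum(c for c, _ in rows)  # 6000 / 50

assert avg_of_avgs == 150.0
assert true_aht == 120.0  # the volume-weighted answer
```

Storing only `total_handle_time_sec` and `contact_count` lets every downstream consumer compute the correct weighted average at any grain.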
Step 2.2 — Add schema.yml entry¶
- name: agg_wbp_contacts_daily
  description: >
    Daily contact volume per XO agent. Grain: one row per (eid, date).
    Source: fct_wbp_contacts. Reusable building block for agent performance reports.
  columns:
    - name: eid
      tests: [not_null]
    - name: date
      tests: [not_null]
    - name: contact_count
      description: "Total contacts handled by agent on this date"
Phase 3 — Gold Report View (rpt_)¶
Purpose: Tableau-ready consumption layer. Always a view (zero storage).
Naming: rpt_{client}_{purpose}_{grain} — include _daily, _weekly, etc. as the last token.
Step 3.0 — Determine the grain and semantic category¶
Before writing a line of SQL, answer two questions:
1. What is the grain?
| Grain | When | Example |
|---|---|---|
| `date × eid × channel` | Agent-level metrics that split by channel | rpt_wbp_agent_performance_channel_daily |
| `date × eid` | Rate metrics whose denominator is agent-day total (ACW%, EPH, MPH, CPH) | rpt_wbp_agent_productivity_daily |
| `date × channel` | Account-level rollups, no agent dimension | rpt_wbp_account_performance_channel_daily |
2. Should the view pre-compute rates or keep raw sums?
- Agent-grained views → keep raw sums and counts. Tableau re-aggregates across agents; pre-computing averages at agent grain breaks team-level or date-range aggregations.
- Account-grained views → pre-compute final rates (`AHT_SEC`, `SLA_PCT`, `CSAT_AVG`, etc.) — consumed as-is; no further Tableau aggregation expected.
See dbt-models.md §5a for the full performance vs productivity split.
Step 3.1 — Create the report view¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/rpt_{client}_{purpose}_{grain}.sql
Pattern A — multi-source agent-grained view (spine pattern)
Use this when joining multiple agg_ sources that may not share the same (date × eid × channel) combinations. Build a spine CTE first so no rows are silently dropped.
{{ config(schema='GOLD', materialized='view', tags=['warbyparker', 'gold', 'report']) }}
/*
GOLD Report: {Description}
Grain: one row per (date × EID × channel)
Sources: agg_wbp_agent_summary_channel_daily, agg_wbp_csat_channel_daily, ...
Target: WBP_DB.GOLD.RPT_WBP_{NAME}
*/
WITH spine AS (
-- UNION (not UNION ALL) normalizes channel aliases across sources
SELECT date, EID, channel FROM {{ ref('agg_wbp_agent_summary_channel_daily') }}
UNION
SELECT date, eid AS EID,
interaction_type AS channel FROM {{ ref('agg_wbp_csat_channel_daily') }}
UNION
SELECT date, eid AS EID,
channel_formatted AS channel FROM {{ ref('agg_wbp_qa_channel_daily') }}
)
SELECT
s.date,
s.EID,
r.FULL_NAME AS agent_name,
-- ... other roster attributes ...
s.channel,
-- Raw sums only — Tableau computes averages across agents
a.contacts_fulfilled,
a.total_handle_time_sec,
c.csat_stars_sum,
c.csat_survey_count
FROM spine s
LEFT JOIN {{ ref('bi_roster_warbyparker') }} r ON s.EID = r.EID
LEFT JOIN {{ ref('agg_wbp_agent_summary_channel_daily') }} a
ON s.date = a.date AND s.EID = a.EID AND s.channel = a.channel
LEFT JOIN {{ ref('agg_wbp_csat_channel_daily') }} c
ON s.date = c.date AND s.EID = c.eid AND s.channel = c.interaction_type
-- ... more LEFT JOINs using source-native channel column name in predicate
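The spine's purpose can be shown with plain set semantics. A minimal Python sketch (keys are invented) demonstrating that the union covers combinations present in only one source, so later LEFT JOINs from the spine cannot silently drop rows:

```python
# Hypothetical (date, eid, channel) key sets from two agg_ sources.
summary_keys = {("2024-01-01", "E100", "email"), ("2024-01-01", "E200", "chat")}
csat_keys    = {("2024-01-01", "E100", "email"), ("2024-01-01", "E300", "voice")}

# UNION (set union, duplicates collapsed) builds the spine: every key from
# every source appears exactly once.
spine = summary_keys | csat_keys

assert len(spine) == 3  # overlap counted once
# E300 has CSAT but no summary row; joining FROM the summary agg would lose it.
assert ("2024-01-01", "E300", "voice") in spine
```

This is why the SQL pattern uses `UNION` rather than driving the joins off any single aggregate.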
Pattern B — account-level rollup view
No agent dimension. Roll each agg_ up to date × channel first, then build spine and join.
{{ config(schema='GOLD', materialized='view', tags=['warbyparker', 'gold', 'report']) }}
WITH summary_rollup AS (
SELECT
date,
channel,
SUM(contacts_fulfilled) AS total_contacts_fulfilled,
SUM(total_handle_time_sec) AS total_handle_time_sec
FROM {{ ref('agg_wbp_agent_summary_channel_daily') }}
GROUP BY 1, 2
),
csat_rollup AS (
SELECT
date,
interaction_type AS channel,
SUM(csat_stars_sum) AS csat_stars_sum,
SUM(csat_survey_count) AS csat_survey_count
FROM {{ ref('agg_wbp_csat_channel_daily') }}
GROUP BY 1, 2
),
spine AS (
SELECT date, channel FROM summary_rollup
UNION
SELECT date, channel FROM csat_rollup
)
SELECT
s.date,
s.channel,
sm.total_contacts_fulfilled,
sm.total_handle_time_sec,
-- Pre-compute rates here (account grain — no Tableau re-aggregation expected)
ROUND(sm.total_handle_time_sec / NULLIF(sm.total_contacts_fulfilled, 0), 2) AS aht_sec,
c.csat_stars_sum,
c.csat_survey_count,
ROUND(c.csat_stars_sum / NULLIF(c.csat_survey_count, 0), 2) AS csat_avg
FROM spine s
LEFT JOIN summary_rollup sm ON s.date = sm.date AND s.channel = sm.channel
LEFT JOIN csat_rollup c ON s.date = c.date AND s.channel = c.channel
Rules for reports (ADR 019 + dbt-models.md §4.5):
- Materialized as view — never dynamic_table or table
- Source from aggregates only (never facts, never Silver)
- Use spine pattern when joining multiple agg_ sources
- Agent-grained views → raw sums and counts; account-grained views → pre-computed rates
- Roster attributes (agent_name, direct_leader, site, tenure_days) joined at this layer — not in agg_
- Use DATE_TRUNC + GROUP BY for weekly/monthly/quarterly rollups
- Use LAG() for period-over-period comparisons (WoW, MoM, QoQ)
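The LAG-based comparison amounts to shifting the prior period onto the current row. A small Python sketch (invented weekly totals) mirroring `LAG(total) OVER (ORDER BY week)` followed by a WoW percentage:

```python
# Weekly totals in date order (invented numbers).
weekly = [("2024-W01", 500), ("2024-W02", 550), ("2024-W03", 495)]

# Equivalent of LAG(total) OVER (ORDER BY week): attach the prior row's value.
with_lag = [
    (wk, cur, weekly[i - 1][1] if i > 0 else None)
    for i, (wk, cur) in enumerate(weekly)
]

# WoW % change; the first period has no prior week, so it stays NULL/None.
wow_pct = [
    round((cur - prev) / prev * 100, 1) if prev else None
    for _, cur, prev in with_lag
]
assert wow_pct == [None, 10.0, -10.0]
```

MoM and QoQ follow the same shape after a `DATE_TRUNC` rollup to the coarser grain.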
Step 3.2 — Add schema.yml entry¶
- name: rpt_wbp_agent_performance_channel_daily
  description: >
    Daily agent performance metrics by channel. Grain: one row per (date × EID × channel).
    Joins agg_wbp_agent_summary_channel_daily, agg_wbp_csat_channel_daily, and
    agg_wbp_qa_channel_daily via spine pattern. Raw sums only — Tableau computes averages.
  columns:
    - name: date
      tests: [not_null]
    - name: eid
      tests: [not_null]
    - name: channel
      tests: [not_null]
Phase 4 — Validation¶
Step 4.1 — Compile and test locally¶
# Compile all new models (domain-first structure)
dbt compile --select {client_lower}.*
# Run tests
dbt test --select {client_lower}.gold.*
# Check row counts make sense
# Facts should have same row count as Silver (before filtering)
# Aggs should have fewer rows (compressed by grain)
# Reports should have same rows as their agg source
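The three row-count expectations above form simple invariants. A trivial Python sketch (counts are invented) stating them as assertions, the same checks you would run by hand against Snowflake:

```python
# Illustrative row counts only; real values come from COUNT(*) queries.
silver_rows, fact_rows, agg_rows, report_rows = 120_000, 120_000, 4_500, 4_500

assert fact_rows == silver_rows   # facts: same grain as Silver, no filtering
assert agg_rows <= fact_rows      # aggregates: compressed by grain + XO filter
assert report_rows == agg_rows    # reports: view over the agg, same row count
```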
Step 4.2 — Compare against existing (if replacing a legacy view)¶
If this Gold model replaces a legacy view in apps/snowflake-schema/:
-- Compare row counts
SELECT 'legacy' AS source, COUNT(*) FROM OPERATIONS.REPORTING.OLD_VIEW
UNION ALL
SELECT 'new' AS source, COUNT(*) FROM WBP_DB_DEV.GOLD.RPT_EMAIL_DAILY;
-- Compare spot-check values for a specific agent/date
Phase 5 — Deploy to Production¶
Step 5.1 — Deploy dbt to prod¶
Trigger the production Airflow DAG with ENVIRONMENT=prod. Verify Gold tables/views appear in {CLIENT}_DB.GOLD.
Step 5.2 — Update model registry¶
Add new models to dbt-model-registry.md.
Step 5.3 — Tableau update (if applicable)¶
- Update or create the Tableau data source pointing to the production `rpt_` view
- Validate field names match what Tableau expects
Checklist Summary¶
- Confirmed Silver models exist and are populated
- Checked dbt-model-registry — model doesn't already exist
- Chosen correct Gold type (fct/dim/agg/rpt)
- Model name follows naming convention
- Correct materialization (`view` for `rpt_`, `table` for `fct_`/`dim_`/`agg_`)
- Uses `env_var('ENVIRONMENT', 'dev')` for database targeting
- Facts use LEFT JOIN to roster and include `IS_XO_AGENT` flag (full dataset)
- Aggregates source from facts only, filter `WHERE is_xo_agent = TRUE`
- Reports source from aggregates only (never facts or Silver)
- schema.yml entry with description, grain, and key tests
- `dbt compile` succeeds
- `dbt test` passes
- Row counts validated
- dbt-model-registry.md updated
- Deployed to production