Playbook: Expanding the Gold Layer¶
Purpose: Step-by-step guide to add new Gold models (facts, dims, aggregates, or reports) to an existing client. Assumes Silver models for the client already exist.
Architecture reference: All conventions follow ADR 019 — Gold Layer Standardization. Project uses domain-first structure (`models/{client}/gold/`).
Before You Start¶
Confirm:
- [ ] Silver models for this client are deployed and populated
- [ ] CORE_DB.SILVER.ROSTER_{CLIENT} is available (if building facts that need agent enrichment)
- [ ] Client glossary exists if needed (e.g., {CLIENT}_DB.SILVER.INBOX_GLOSSARY)
- [ ] You know the grain of the new model (one row per what?)
- [ ] Check dbt-model-registry.md — does this model already exist?
Decision: Which Gold Type?¶
| You want to... | Build a... | Prefix |
|---|---|---|
| Enrich raw events with roster/glossary data | Fact | fct_ |
| Track current state of an entity (agent, contact) | Dimension | dim_ |
| Pre-aggregate a metric for reuse across reports | Aggregate | agg_ |
| Create a Tableau-ready consumption view | Report | rpt_ |
Reports must source from Aggregates only (never Facts or Silver). See ADR 019.
Phase 1 — Gold Fact Model (fct_)¶
Purpose: Enriched event-level data. Joins Silver events with rosters and glossaries.
Step 1.1 — Create the fact model¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/fct_{client}_{object}.sql
{{ config(
schema='GOLD',
materialized='table'
) }}
-- Database inherited from dbt_project.yml at client folder level (ADR 019)
WITH contacts AS (
SELECT * FROM {{ ref('wbp_contact_timestamps') }} -- Silver model (client-prefixed)
),
roster AS (
SELECT * FROM {{ ref('bi_roster_warbyparker') }} -- Core Silver roster via ref()
)
SELECT
c.CONTACT_ID,
c.CREATED_AT,
c.AGENT_ID,
-- Roster enrichment (LEFT JOIN — keep ALL records)
{{ roster_columns('r') }},
-- XO flag (ADR 019): downstream aggregates filter on this
CASE WHEN r.EID IS NOT NULL THEN TRUE ELSE FALSE END AS IS_XO_AGENT,
-- Metadata
c.RECORD_KEY,
c.DATE_TO_WAREHOUSE
FROM contacts c
LEFT JOIN roster r ON c.AGENT_ID = r.GLADLY_ID -- LEFT JOIN, not INNER
Note on `dynamic_table`: The default materialization for `fct_`, `dim_`, and `agg_` is `table`. Use `dynamic_table` only when Snowflake should own the refresh schedule (e.g., continuous/streaming pipelines, sub-hourly lag requirements). For all daily Airflow-orchestrated Gold models, use `table`.
Rules for facts (ADR 019):
- Source from Silver only (never Bronze)
- LEFT JOIN to roster — facts contain ALL records, not just XO agents
- Add IS_XO_AGENT flag: CASE WHEN r.EID IS NOT NULL THEN TRUE ELSE FALSE END
- Aggregates and reports filter WHERE is_xo_agent = TRUE
- Same grain as the Silver source (event-level)
- No aggregation
- Include RECORD_KEY and DATE_TO_WAREHOUSE for traceability
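The LEFT JOIN plus flag rule can be seen outside SQL. A minimal Python sketch (sample data is invented) showing that every contact survives the join, and an agent missing from the roster simply gets `is_xo_agent = False` rather than being dropped:

```python
# Hypothetical sample data illustrating the LEFT JOIN + IS_XO_AGENT rule.
contacts = [
    {"contact_id": 1, "agent_id": "a1"},
    {"contact_id": 2, "agent_id": "a2"},
    {"contact_id": 3, "agent_id": "zz"},  # agent not on the XO roster
]
roster = {"a1": {"eid": "E100"}, "a2": {"eid": "E200"}}

enriched = [
    {**c,
     "eid": roster.get(c["agent_id"], {}).get("eid"),
     "is_xo_agent": c["agent_id"] in roster}  # flag, don't filter
    for c in contacts
]

assert len(enriched) == len(contacts)  # LEFT JOIN keeps ALL records
assert [r["is_xo_agent"] for r in enriched] == [True, True, False]
```

An INNER JOIN here would have silently dropped contact 3; the flag defers that decision to the aggregate layer.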
Step 1.2 — Add schema.yml entry¶
models:
  - name: fct_wbp_contacts
    description: >
      Enriched contact events from Gladly, joined with XO agent roster.
      Grain: one row per contact (CONTACT_ID). Full dataset with IS_XO_AGENT flag.
    columns:
      - name: contact_id
        description: "Gladly contact ID — primary key"
        tests:
          - unique
          - not_null
      - name: agent_id
        description: "XO agent identifier — foreign key to dim_agent"
        tests:
          - not_null
      - name: created_at
        description: "Contact creation timestamp (UTC)"
        tests:
          - not_null
Phase 2 — Gold Aggregate Model (agg_)¶
Purpose: Pre-aggregated metrics, reusable building blocks for multiple report views.
Naming: agg_{client}_{subject}_{grain} — grain is always the last token.
Step 2.1 — Create the aggregate model¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/agg_{client}_{subject}_{grain}.sql
{{ config(
schema='GOLD',
materialized='table'
) }}
-- Database inherited from dbt_project.yml at client folder level (ADR 019)
WITH facts AS (
SELECT * FROM {{ ref('fct_wbp_contacts') }} -- Source from facts, not Silver
WHERE is_xo_agent = TRUE -- XO filter at aggregate level
)
SELECT
CREATED_AT::DATE AS date,
EID,
AGENT_NAME,
-- Store sums and counts, NEVER averages (ADR 019)
COUNT(DISTINCT CONTACT_ID) AS contact_count,
SUM(handle_time_sec) AS total_handle_time_sec,
SUM(acw_sec) AS total_acw_sec
FROM facts
GROUP BY 1, 2, 3
Rules for aggregates (ADR 019):
- Source from Gold facts only (never Silver, never Bronze, never roster directly)
- Daily grain only — weekly/monthly/quarterly rollups happen in report views via DATE_TRUNC
- Filter WHERE is_xo_agent = TRUE (inherited from fact's IS_XO_AGENT flag)
- Store sums and counts, never averages — downstream computes avg = SUM(x) / SUM(count)
- Include dimension columns needed for slicing (EID, AGENT_NAME, etc.)
- No JOINs — facts already have enrichment
- If an aggregate needs a column not in a fact, add it to the fact
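The "sums and counts, never averages" rule exists because averages do not re-aggregate correctly. A short Python sketch (invented numbers) showing how averaging pre-computed per-row averages diverges from re-deriving the rate from stored sums:

```python
# Two agent-days with different volumes: (contact_count, total_handle_time_sec).
rows = [(10, 2000), (40, 4000)]

# Wrong: averaging pre-computed per-row averages weights both rows equally.
avg_of_avgs = sum(t / c for c, t in rows) / len(rows)  # (200 + 100) / 2

# Right: re-derive the rate from stored sums, as ADR 019 prescribes.
true_aht = sum(t for _, t in rows) / sum(c for c, _ in rows)  # 6000 / 50

assert avg_of_avgs == 150.0
assert true_aht == 120.0  # the volume-weighted answer
```

Storing only `total_handle_time_sec` and `contact_count` lets every downstream consumer compute the correct weighted average at any grain.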
Step 2.2 — Add schema.yml entry¶
- name: agg_wbp_contacts_daily
  description: >
    Daily contact volume per XO agent. Grain: one row per (eid, date).
    Source: fct_wbp_contacts. Reusable building block for agent performance reports.
  columns:
    - name: eid
      tests: [not_null]
    - name: date
      tests: [not_null]
    - name: contact_count
      description: "Total contacts handled by agent on this date"
Phase 3 — Gold Report View (rpt_)¶
Purpose: Tableau-ready consumption layer. Always a view (zero storage).
Naming: rpt_{client}_{purpose}_{grain} — include _daily, _weekly, etc. as the last token.
Step 3.0 — Determine the grain and semantic category¶
Before writing a line of SQL, answer two questions:
1. What is the grain?
| Grain | When | Example |
|---|---|---|
| `date × eid × channel` | Agent-level metrics that split by channel | rpt_wbp_agent_performance_channel_daily |
| `date × eid` | Rate metrics whose denominator is agent-day total (ACW%, EPH, MPH, CPH) | rpt_wbp_agent_productivity_daily |
| `date × channel` | Account-level rollups, no agent dimension | rpt_wbp_account_performance_channel_daily |
2. Should the view pre-compute rates or keep raw sums?
- Agent-grained views → keep raw sums and counts. Tableau re-aggregates across agents; pre-computing averages at agent grain breaks team-level or date-range aggregations.
- Account-grained views → pre-compute final rates (`AHT_SEC`, `SLA_PCT`, `CSAT_AVG`, etc.) — consumed as-is; no further Tableau aggregation expected.
See dbt-models.md §5a for the full performance vs productivity split.
Step 3.1 — Create the report view¶
Location: apps/dbt/xo-medallion/models/{client_lower}/gold/rpt_{client}_{purpose}_{grain}.sql
Pattern A — multi-source agent-grained view (spine pattern)
Use this when joining multiple agg_ sources that may not share the same (date × eid × channel) combinations. Build a spine CTE first so no rows are silently dropped.
{{ config(schema='GOLD', materialized='view', tags=['warbyparker', 'gold', 'report']) }}
/*
GOLD Report: {Description}
Grain: one row per (date × EID × channel)
Sources: agg_wbp_agent_summary_channel_daily, agg_wbp_csat_channel_daily, ...
Target: WBP_DB.GOLD.RPT_WBP_{NAME}
*/
WITH spine AS (
-- UNION (not UNION ALL) normalizes channel aliases across sources
SELECT date, EID, channel FROM {{ ref('agg_wbp_agent_summary_channel_daily') }}
UNION
SELECT date, eid AS EID,
interaction_type AS channel FROM {{ ref('agg_wbp_csat_channel_daily') }}
UNION
SELECT date, eid AS EID,
channel_formatted AS channel FROM {{ ref('agg_wbp_qa_channel_daily') }}
)
SELECT
s.date,
s.EID,
r.FULL_NAME AS agent_name,
-- ... other roster attributes ...
s.channel,
-- Raw sums only — Tableau computes averages across agents
a.contacts_fulfilled,
a.total_handle_time_sec,
c.csat_stars_sum,
c.csat_survey_count
FROM spine s
LEFT JOIN {{ ref('bi_roster_warbyparker') }} r ON s.EID = r.EID
LEFT JOIN {{ ref('agg_wbp_agent_summary_channel_daily') }} a
ON s.date = a.date AND s.EID = a.EID AND s.channel = a.channel
LEFT JOIN {{ ref('agg_wbp_csat_channel_daily') }} c
ON s.date = c.date AND s.EID = c.eid AND s.channel = c.interaction_type
-- ... more LEFT JOINs using source-native channel column name in predicate
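The spine's purpose can be shown with plain set semantics. A minimal Python sketch (keys are invented) demonstrating that the union covers combinations present in only one source, so later LEFT JOINs from the spine cannot silently drop rows:

```python
# Hypothetical (date, eid, channel) key sets from two agg_ sources.
summary_keys = {("2024-01-01", "E100", "email"), ("2024-01-01", "E200", "chat")}
csat_keys    = {("2024-01-01", "E100", "email"), ("2024-01-01", "E300", "voice")}

# UNION (set union, duplicates collapsed) builds the spine: every key from
# every source appears exactly once.
spine = summary_keys | csat_keys

assert len(spine) == 3  # overlap counted once
# E300 has CSAT but no summary row; joining FROM the summary agg would lose it.
assert ("2024-01-01", "E300", "voice") in spine
```

This is why the SQL pattern uses `UNION` rather than driving the joins off any single aggregate.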
Pattern B — account-level rollup view
No agent dimension. Roll each agg_ up to date × channel first, then build spine and join.
{{ config(schema='GOLD', materialized='view', tags=['warbyparker', 'gold', 'report']) }}
WITH summary_rollup AS (
SELECT
date,
channel,
SUM(contacts_fulfilled) AS total_contacts_fulfilled,
SUM(total_handle_time_sec) AS total_handle_time_sec
FROM {{ ref('agg_wbp_agent_summary_channel_daily') }}
GROUP BY 1, 2
),
csat_rollup AS (
SELECT
date,
interaction_type AS channel,
SUM(csat_stars_sum) AS csat_stars_sum,
SUM(csat_survey_count) AS csat_survey_count
FROM {{ ref('agg_wbp_csat_channel_daily') }}
GROUP BY 1, 2
),
spine AS (
SELECT date, channel FROM summary_rollup
UNION
SELECT date, channel FROM csat_rollup
)
SELECT
s.date,
s.channel,
sm.total_contacts_fulfilled,
sm.total_handle_time_sec,
-- Pre-compute rates here (account grain — no Tableau re-aggregation expected)
ROUND(sm.total_handle_time_sec / NULLIF(sm.total_contacts_fulfilled, 0), 2) AS aht_sec,
c.csat_stars_sum,
c.csat_survey_count,
ROUND(c.csat_stars_sum / NULLIF(c.csat_survey_count, 0), 2) AS csat_avg
FROM spine s
LEFT JOIN summary_rollup sm ON s.date = sm.date AND s.channel = sm.channel
LEFT JOIN csat_rollup c ON s.date = c.date AND s.channel = c.channel
Rules for reports (ADR 019 + dbt-models.md §4.5):
- Materialized as view — never dynamic_table or table
- Source from aggregates only (never facts, never Silver)
- Use spine pattern when joining multiple agg_ sources
- Agent-grained views → raw sums and counts; account-grained views → pre-computed rates
- Roster attributes (agent_name, direct_leader, site, tenure_days) joined at this layer — not in agg_
- Use DATE_TRUNC + GROUP BY for weekly/monthly/quarterly rollups
- Use LAG() for period-over-period comparisons (WoW, MoM, QoQ)
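The LAG-based comparison amounts to shifting the prior period onto the current row. A small Python sketch (invented weekly totals) mirroring `LAG(total) OVER (ORDER BY week)` followed by a WoW percentage:

```python
# Weekly totals in date order (invented numbers).
weekly = [("2024-W01", 500), ("2024-W02", 550), ("2024-W03", 495)]

# Equivalent of LAG(total) OVER (ORDER BY week): attach the prior row's value.
with_lag = [
    (wk, cur, weekly[i - 1][1] if i > 0 else None)
    for i, (wk, cur) in enumerate(weekly)
]

# WoW % change; the first period has no prior week, so it stays NULL/None.
wow_pct = [
    round((cur - prev) / prev * 100, 1) if prev else None
    for _, cur, prev in with_lag
]
assert wow_pct == [None, 10.0, -10.0]
```

MoM and QoQ follow the same shape after a `DATE_TRUNC` rollup to the coarser grain.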
Step 3.2 — Add schema.yml entry¶
- name: rpt_wbp_agent_performance_channel_daily
  description: >
    Daily agent performance metrics by channel. Grain: one row per (date × EID × channel).
    Joins agg_wbp_agent_summary_channel_daily, agg_wbp_csat_channel_daily, and
    agg_wbp_qa_channel_daily via spine pattern. Raw sums only — Tableau computes averages.
  columns:
    - name: date
      tests: [not_null]
    - name: eid
      tests: [not_null]
    - name: channel
      tests: [not_null]
Phase 4 — Validation¶
Step 4.1 — Compile and test locally¶
# Compile all new models (domain-first structure)
dbt compile --select {client_lower}.*
# Run tests
dbt test --select {client_lower}.gold.*
# Check row counts make sense
# Facts should have same row count as Silver (before filtering)
# Aggs should have fewer rows (compressed by grain)
# Reports should have same rows as their agg source
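The three row-count expectations above form simple invariants. A trivial Python sketch (counts are invented) stating them as assertions, the same checks you would run by hand against Snowflake:

```python
# Illustrative row counts only; real values come from COUNT(*) queries.
silver_rows, fact_rows, agg_rows, report_rows = 120_000, 120_000, 4_500, 4_500

assert fact_rows == silver_rows   # facts: same grain as Silver, no filtering
assert agg_rows <= fact_rows      # aggregates: compressed by grain + XO filter
assert report_rows == agg_rows    # reports: view over the agg, same row count
```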
Step 4.2 — Compare against existing (if replacing a legacy view)¶
If this Gold model replaces a legacy view in apps/snowflake-schema/:
-- Compare row counts
SELECT 'legacy' AS source, COUNT(*) FROM OPERATIONS.REPORTING.OLD_VIEW
UNION ALL
SELECT 'new' AS source, COUNT(*) FROM WBP_DB_DEV.GOLD.RPT_EMAIL_DAILY;
-- Compare spot-check values for a specific agent/date
Phase 5 — Deploy to Production¶
Step 5.1 — Deploy dbt to prod¶
Trigger the production Airflow DAG with ENVIRONMENT=prod. Verify Gold tables/views appear in {CLIENT}_DB.GOLD.
Step 5.2 — Update model registry¶
Add new models to dbt-model-registry.md.
Step 5.3 — Tableau update (if applicable)¶
- Update or create the Tableau data source pointing to the production `rpt_` view
- Validate field names match what Tableau expects
Checklist Summary¶
- Confirmed Silver models exist and are populated
- Checked dbt-model-registry — model doesn't already exist
- Chosen correct Gold type (fct/dim/agg/rpt)
- Model name follows naming convention
- Correct materialization (`view` for `rpt_`, `table` for `fct_`/`dim_`/`agg_`)
- Uses `env_var('ENVIRONMENT', 'dev')` for database targeting
- Facts use LEFT JOIN to roster and include `IS_XO_AGENT` flag (full dataset)
- Aggregates source from facts only, filter `WHERE is_xo_agent = TRUE`
- Reports source from aggregates only (never facts or Silver)
- schema.yml entry with description, grain, and key tests
- `dbt compile` succeeds
- `dbt test` passes
- Row counts validated
- dbt-model-registry.md updated
- Deployed to production