# ADR 011: Bronze Batch-Scoped Replace

**Status:** Accepted
**Date:** 2026-02-17
**Supersedes:** ADR 006 - Bronze Truncate Force (partially; legacy pipelines still use truncate)

## Context
Bronze tables in the new architecture (per-client databases + dbt) were truncated on every pipeline run (`TRUNCATE TABLE` followed by `COPY INTO`), so Bronze held only the most recent load's data. If `dbt --full-refresh` was executed on Silver incremental models, Silver would be rebuilt from Bronze, losing all historical data except the current load.
## Decision

Introduce `batch_replace` as a new Bronze load strategy for `snowflake_load` pipelines:

- `batch_replace`: Deletes only the current batch's rows, then loads. Bronze accumulates historical data across batches.
- `truncate_insert` (legacy default): Truncates the entire table, then loads. Bronze holds only the current batch.
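The difference between the two strategies can be sketched as the SQL each one would issue before `COPY INTO`. This is a minimal illustration, not the actual `snowflake_tasks.py` code; the `build_load_sql` helper and the table/stage identifiers are hypothetical:

```python
# Illustrative sketch of the SQL each Bronze load strategy issues.
# build_load_sql and all identifiers here are hypothetical, not the
# real snowflake_tasks.py implementation.

def build_load_sql(table: str, stage: str, batch_id: str,
                   load_strategy: str) -> list[str]:
    """Return the statements a Bronze load would run for the given strategy."""
    if load_strategy == "batch_replace":
        # Remove only this batch's rows, so prior batches accumulate.
        pre_load = f"DELETE FROM {table} WHERE BATCH_ID = '{batch_id}'"
    elif load_strategy == "truncate_insert":
        # Legacy behavior: wipe the whole table before loading.
        pre_load = f"TRUNCATE TABLE {table}"
    else:
        raise ValueError(f"unknown load_strategy: {load_strategy}")
    return [pre_load, f"COPY INTO {table} FROM @{stage}"]

stmts = build_load_sql("BRONZE.GLADLY_CONVERSATIONS", "gladly_stage",
                       "2026-02-17", "batch_replace")
```

Because `batch_replace` scopes the delete to `BATCH_ID`, re-running a batch replaces only its own rows and leaves earlier batches intact.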
## Implementation

| Component | Change |
|---|---|
| `dag_config.py` | Added `batch_replace` to the `SnowflakeSourceConfig.load_strategy` Literal |
| `snowflake_tasks.py` | `copy_to_snowflake` branches on `load_strategy`: `DELETE WHERE BATCH_ID` vs `TRUNCATE` |
| `snowflake_load.py.j2` | Passes the source-level `load_strategy` override after `**BASE_SNOWFLAKE_CONFIG` |
| `warbyparker-gladly-daily.yaml` | All 4 sources changed to `batch_replace` |
| `legacy_elt.py.j2` | Not modified; continues using `truncate_insert` from base config |
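The `dag_config.py` change can be sketched as a `Literal`-constrained field with the legacy strategy as the default. This is an assumption about the shape of the config class; only `load_strategy` and the two strategy names come from this ADR, the rest is illustrative:

```python
# Hypothetical sketch of SnowflakeSourceConfig with the load_strategy
# Literal. Only load_strategy and its two values come from the ADR;
# the other fields and validation are illustrative.
from dataclasses import dataclass
from typing import Literal, get_args

LoadStrategy = Literal["truncate_insert", "batch_replace"]

@dataclass
class SnowflakeSourceConfig:
    table: str
    load_strategy: LoadStrategy = "truncate_insert"  # legacy default

    def __post_init__(self) -> None:
        # Reject strategies outside the Literal at construction time.
        if self.load_strategy not in get_args(LoadStrategy):
            raise ValueError(f"invalid load_strategy: {self.load_strategy!r}")

cfg = SnowflakeSourceConfig(table="GLADLY_CONVERSATIONS",
                            load_strategy="batch_replace")
```

Defaulting to `truncate_insert` is what keeps the change backward compatible: existing pipelines that never set `load_strategy` behave exactly as before.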
## Consequences

### Positive

- `dbt --full-refresh` on Silver models rebuilds correctly from full Bronze history
- Idempotent: re-running the same batch date produces identical results
- Backward compatible: the default `truncate_insert` preserves legacy pipeline behavior
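The idempotency claim can be checked with a small in-memory simulation (purely illustrative, not pipeline code): because the delete is scoped to the batch being loaded, re-running that batch converges to the same table state while earlier batches survive.

```python
# In-memory simulation of batch_replace idempotency. The "table" is a
# list of (batch_id, value) rows; this is illustrative, not pipeline code.

def batch_replace(table: list[tuple[str, str]], batch_id: str,
                  rows: list[str]) -> list[tuple[str, str]]:
    """Delete this batch's rows, then insert the new ones (order-normalized)."""
    kept = [r for r in table if r[0] != batch_id]
    return sorted(kept + [(batch_id, v) for v in rows])

table = batch_replace([], "2026-02-16", ["a", "b"])
once = batch_replace(table, "2026-02-17", ["c"])
twice = batch_replace(once, "2026-02-17", ["c"])  # re-run the same batch
assert once == twice                  # identical results on re-run
assert ("2026-02-16", "a") in twice   # earlier batches preserved
```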
### Negative

- Bronze tables grow over time (mitigated by a future retention policy)
- `DELETE` is slightly slower than `TRUNCATE` (mitigated by a future `CLUSTER BY BATCH_ID`)
## Future Follow-ups

- Bronze clustering: `ALTER TABLE ... CLUSTER BY (BATCH_ID)` for efficient deletes
- Bronze retention: Snowflake scheduled task to purge records older than 180 days
- Deprecate `truncate_insert` for new `snowflake_load` pipelines
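The first two follow-ups could look roughly like the DDL below. This is a sketch only: the table and warehouse names are placeholders, and the assumption that `BATCH_ID` is a `YYYY-MM-DD` string is not stated in this ADR.

```sql
-- Sketch only: object names are placeholders, and BATCH_ID is assumed
-- to be a 'YYYY-MM-DD' string.

-- Cluster Bronze on BATCH_ID so batch-scoped DELETEs prune micro-partitions.
ALTER TABLE BRONZE.GLADLY_CONVERSATIONS CLUSTER BY (BATCH_ID);

-- Scheduled retention: purge Bronze records older than 180 days.
-- (The task must be resumed with ALTER TASK ... RESUME before it runs.)
CREATE OR REPLACE TASK BRONZE.PURGE_OLD_BATCHES
  WAREHOUSE = TRANSFORM_WH
  SCHEDULE = 'USING CRON 0 3 * * * UTC'
AS
  DELETE FROM BRONZE.GLADLY_CONVERSATIONS
  WHERE BATCH_ID < TO_VARCHAR(DATEADD(day, -180, CURRENT_DATE), 'YYYY-MM-DD');
```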