Skip to content

Materializations Demo

This example shows how different materializations (view, table, incremental, ephemeral) behave in FastFlowTransform.

The demo models are located under:

examples/materializations_demo/models/

````

Each model type demonstrates how FastFlowTransform builds, caches, or executes models differently depending on its `materialized:` configuration.

Supported engines: DuckDB, Postgres, Databricks/Spark, BigQuery (pandas & BigFrames), and Snowflake Snowpark.

---

## 🧩 1. View Models

A **view** model is always re-created from scratch each run.  
It defines a virtual relation that doesn’t store data permanently — ideal for lightweight transformations.

```sql
{{ config(materialized='view') }}

select
    order_id,
    customer_id,
    total_amount,
    order_date
from {{ ref('stg_orders') }}
````

**Characteristics**

* Rebuilt each run (no persisted data)
* Useful for staging, joins, and intermediate logic
* Fast and always up-to-date with upstreams
* Cannot store or cache incremental state

---

## 🧱 2. Table Models

A **table** model materializes into a physical table on the target engine.

```sql
{{ config(materialized='table') }}

select *
from {{ ref('fct_orders_view') }}

Characteristics

  • Fully rebuilt every run
  • Good for final curated datasets or small tables
  • Overwrites previous contents (atomic replace)
  • Compatible with all engines (DuckDB, Postgres, Databricks, BigQuery, Snowflake)

⚡ 3. Incremental Models

An incremental model stores state and only updates changed records on subsequent runs.

{{ config(
    materialized='incremental',
    incremental={
        "enabled": true,
        "unique_key": "order_id",
        "updated_at_column": "updated_at",
        "delta_sql": "select * from {{ ref('stg_orders') }} where updated_at > (select max(updated_at) from {{ this }})"
    }
) }}

Characteristics

  • Persists data between runs
  • Only merges new or changed rows
  • Significantly faster for large tables
  • Requires unique_key and (optionally) an updated_at_column
  • Schema changes can be managed via:

  • on_schema_change: "ignore"

  • on_schema_change: "append_new_columns"
  • on_schema_change: "sync_all_columns"

Behavior example:

Run Operation Rows affected
1 full load 10,000
2 merge delta 120
3 merge delta 45

🧮 4. Ephemeral Models

An ephemeral model exists only during query compilation. It never creates a physical table or view — it’s inlined wherever it’s referenced.

{{ config(materialized='ephemeral') }}

select
    order_id,
    total_amount * 0.1 as tax_amount
from {{ ref('fct_orders_inc') }}

Characteristics

  • Inlined into parent queries
  • Reduces I/O overhead (no temporary tables)
  • Ideal for lightweight reusable SQL snippets
  • Not visible in the warehouse after execution

🔗 5. Combined Example DAG

In the demo, these models are connected as follows:

stg_orders
   ↓
fct_orders_view (view)
   ↓
fct_orders_tbl (table)
   ↓
fct_orders_inc (incremental)
   ↓
fct_orders_ephemeral (ephemeral)

This DAG demonstrates:

  • How data flows between materializations
  • Which ones persist or recompute data
  • How incremental models can feed downstream table or ephemeral models

🧭 When to Use Each Type

Materialization Persists? Performance Recommended Use Case
view ❌ No ⚡ Fast rebuild Intermediate or temporary transformations
table ✅ Yes ⚖️ Moderate Final outputs or smaller datasets
incremental ✅ Yes 🚀 High (on deltas) Large, frequently updated fact tables
ephemeral ❌ No ⚡ Fast inline Reusable SQL snippets or shared logic

🧠 Tips

  • You can set default materializations in project.yml under models.materialized.
  • Override per model using {{ config(materialized='...') }}.
  • For incremental models, ensure unique keys and delta logic are consistent across runs.
  • Test behavior locally using the DuckDB engine before deploying to a warehouse.