Hooks
Hooks let you plug custom behavior into a FastFlowTransform run without changing your models. They’re mainly used for:
- Auditing (run & model audit tables)
- Notifications/logging
- Lightweight data quality checks
- Custom side-effects around model execution
This page explains:
- Lifecycle events you can hook into
- How to configure hooks in project.yml
- How to write Python hooks with @fft_hook
- How to write SQL hooks (inline & file-based)
- What context each hook receives
1. Lifecycle events
There are two scopes:
- Run-level hooks: fire once per fft run
- Model-level hooks: fire for each model (either globally via a selector or directly on the model)
Run-level events
| Event | When it fires | Config key in project.yml | Typical use |
|---|---|---|---|
| on_run_start | Right after the project loads, before any model runs | hooks.on_run_start | Create audit tables, banners |
| on_run_end | After all models run & budgets are evaluated | hooks.on_run_end | Final audit row, summary, alert |
Note: on_run_end is invoked even if some models fail. The hook receives a run.status of "success" or "error".
Model-level events
These are defined in project.yml under hooks: and applied to models matching a selector.
| Event | Meaning | Config key in project.yml |
|---|---|---|
| before_model | Right before a model starts (per matching model) | hooks.before_model |
| after_model | Right after a model finishes (per matching model) | hooks.after_model |
Under the hood, these are attached to the model’s meta and executed as “pre/post hooks” around the model.
2. Configuring hooks in project.yml
Hooks live under the top-level hooks: key in project.yml.
2.1 Run-level hooks
Each hook entry is a HookSpec with at least:
- name: logical name of the hook
- kind: "sql" or "python"
- optional engines: list of engine names to restrict execution (e.g. ["duckdb", "bigquery"])
- optional sql: inline SQL body (for SQL hooks)
- optional params: extra free-form values passed to the Python hook context
Example:
hooks:
  on_run_start:
    - name: create_audit_tables
      kind: sql
      sql: |
        create table if not exists _ff_run_audit (
          run_id text,
          started_at timestamp,
          finished_at timestamp,
          status text,
          env text,
          engine text,
          -- row_count and error are set by hooks/audit_run_end.sql;
          -- the column types here are assumptions
          row_count bigint,
          error text
        );
    - name: audit_run_start
      kind: sql
      sql: |
        insert into _ff_run_audit (run_id, started_at, status, env, engine)
        values (
          {{ run.run_id | sql_literal }},
          current_timestamp,
          'running',
          {{ run.env_name | sql_literal }},
          {{ run.engine_name | sql_literal }}
        );
    - name: python_banner
      kind: python
  on_run_end:
    - name: audit_run_end
      kind: sql  # SQL body lives in hooks/audit_run_end.sql
    - name: python_summary
      kind: python
Notes:
- For Python hooks, name must match the registration name in the decorator (see below).
- For SQL hooks, the SQL can be inline (sql:) or come from a .sql file in hooks/ (see section 4).
2.2 Model-level hooks with select
Model hooks can be attached by selector, not by hard-coding them into each model.
This is what before_model and after_model are for:
hooks:
  before_model:
    - name: model_start_audit
      kind: sql
      select: "tag:example:hooks_demo"
      sql: |
        insert into _ff_model_audit (
          run_id,
          model_name,
          event,
          status,
          started_at
        )
        values (
          {{ run.run_id | sql_literal }},
          {{ model.name | sql_literal }},
          'start',
          'running',
          current_timestamp
        );
  after_model:
    - name: model_end_audit
      kind: sql
      select: "tag:example:hooks_demo"
      sql: |
        update _ff_model_audit
        set finished_at = current_timestamp,
            status = 'success',
            rows_affected = null,
            elapsed_ms = null
        where run_id = {{ run.run_id | sql_literal }}
          and model_name = {{ model.name | sql_literal }}
          and event = 'start';
    - name: model_end_log_python
      kind: python
      select: "tag:scope:mart"
- select uses the same selector language as fft run (tag:…, model:…, etc.).
- The hook is executed for each model that matches the selector.
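To make the matching step concrete, here is a minimal standalone sketch. It assumes that a selector of the form tag:<value> matches models whose tags contain <value>, and that model:<name> matches by model name; the real selector language may support more forms. The function name matches_selector and the model name mart_users.ff are hypothetical.

```python
from typing import Any


def matches_selector(selector: str, model: dict[str, Any]) -> bool:
    """Assumed semantics: 'tag:<tag>' matches on tags, 'model:<name>' on the name."""
    # Split on the first colon only, so tags that contain colons stay intact.
    kind, _, value = selector.partition(":")
    if kind == "tag":
        return value in model.get("tags", [])
    if kind == "model":
        return value == model.get("name")
    raise ValueError(f"unsupported selector in this sketch: {selector!r}")


models = [
    {"name": "events_clean.ff", "tags": ["example:hooks_demo"]},
    {"name": "mart_users.ff", "tags": ["scope:mart"]},  # hypothetical model
]
selected = [m["name"] for m in models if matches_selector("tag:scope:mart", m)]
print(selected)
```

Under these assumptions, a hook with select: "tag:scope:mart" would run only for the second model.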
3. Python hooks
Python hooks live in the project’s hooks/ directory (any subfolder), and are registered via the @fft_hook decorator.
3.1 Basic structure
Example hooks/notify.py:
from __future__ import annotations

from typing import Any

from fastflowtransform.hooks.registry import fft_hook


def _fmt_env(env: dict[str, Any]) -> str:
    parts = []
    for key in ("FF_ENGINE", "FFT_ACTIVE_ENV"):
        if key in env:
            parts.append(f"{key}={env[key]}")
    return ", ".join(parts) if parts else "<no env>"


@fft_hook(name="python_banner", when="on_run_start")
def on_run_start(ctx: dict[str, Any]) -> None:
    run = ctx.get("run", {})
    env = ctx.get("env", {})
    print(
        f"[hooks_demo] on_run_start: run_id={run.get('run_id')} "
        f"(env_name={run.get('env_name')}, engine={run.get('engine_name')}; {_fmt_env(env)})"
    )


@fft_hook(name="python_summary", when="on_run_end")
def on_run_end(ctx: dict[str, Any]) -> None:
    run = ctx.get("run", {})
    stats = ctx.get("stats", {}) or {}
    print(
        "[hooks_demo] on_run_end: run_id=%s status=%s (built=%s, skipped=%s, failed=%s)"
        % (
            run.get("run_id"),
            run.get("status"),
            stats.get("models_built"),
            stats.get("models_skipped"),
            stats.get("models_failed"),
        )
    )


@fft_hook(name="model_end_log_python", when="after_model")
def on_model_end(ctx: dict[str, Any]) -> None:
    run = ctx.get("run", {})
    model = ctx.get("model", {})
    print(
        "[hooks_demo] on_model_end: run_id=%s model=%s"
        % (run.get("run_id"), model.get("name"))
    )
All Python files under hooks/**.py are loaded when the run starts; their @fft_hook decorators populate the registry.
3.2 The @fft_hook decorator
@fft_hook(name="python_banner", when="on_run_start")
def on_run_start(ctx: dict[str, Any]) -> None:
    ...
- name: Logical name; must match project.yml’s name field for the hook. If omitted, defaults to the function name.
- when: Lifecycle event this hook is for, one of:
  - "on_run_start"
  - "on_run_end"
  - "before_model"
  - "after_model"

Only these values are accepted; anything else will raise at registration time.
3.3 Python hook context
Python hooks always receive a single dictionary argument, the hook context.
Shape (simplified):
ctx: dict[str, Any] = {
    "when": "on_run_start" | "on_run_end" | "before_model" | "after_model",
    "run": {
        "run_id": str,          # unique run identifier
        "env_name": str,        # profile/env name (e.g. 'dev_duckdb')
        "engine_name": str,     # engine (e.g. 'duckdb')
        "started_at": str,      # ISO timestamp
        "status": str | None,   # on_run_end: 'success' | 'error'
        "row_count": int | None,
        "error": str | None,
    },
    "model": {
        "name": str,
        "path": pathlib.Path,
        "tags": list[str],
        "meta": dict[str, Any],
        # (future extensions: status/rows/elapsed/error for model events)
    } | None,
    "env": dict[str, str],      # env vars relevant to FFT (FF_* etc.)
    # Only for on_run_end:
    "stats": {
        "models_built": int,
        "models_failed": int,
        "models_skipped": int,
    } | None,
    # Plus any extra keys from HookSpec.params:
    # e.g. ctx["slack_channel"], ctx["threshold"], ...
}
So for simple hooks you can do:
run = ctx["run"]
if ctx["when"] == "on_run_end" and run["status"] == "error":
    ...
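A slightly fuller sketch of the same idea, written to run on its own: it reads the context defensively and picks up an extra key supplied through HookSpec.params. The function name failure_message, the slack_channel key, and its default value are hypothetical; in a project the function would be called from a hook registered with @fft_hook, which is left out here so the example has no dependencies.

```python
from typing import Any, Optional


def failure_message(ctx: dict[str, Any]) -> Optional[str]:
    """Return an alert text for failed runs, or None when there is nothing to report."""
    run = ctx.get("run") or {}
    if ctx.get("when") != "on_run_end" or run.get("status") != "error":
        return None
    stats = ctx.get("stats") or {}
    # "slack_channel" is a hypothetical key that would come from HookSpec.params.
    channel = ctx.get("slack_channel", "#data-alerts")
    return (
        f"[{channel}] run {run.get('run_id')} failed: "
        f"{stats.get('models_failed', 0)} model(s) failed, error={run.get('error')}"
    )


example_ctx = {
    "when": "on_run_end",
    "run": {"run_id": "abc123", "status": "error", "error": "boom"},
    "stats": {"models_built": 3, "models_failed": 1, "models_skipped": 0},
    "slack_channel": "#fft",
}
print(failure_message(example_ctx))
```

Keeping the message-building logic in a plain function like this also makes it easy to unit test without starting a run.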
4. SQL hooks
SQL hooks are Jinja-templated SQL statements that are executed via your target engine.
You can define them:
- Inline in project.yml; or
- In a .sql file under hooks/, referenced by name.
4.1 Inline SQL
Inline SQL was shown in the examples above:
- name: audit_run_start
  kind: sql
  sql: |
    insert into _ff_run_audit (run_id, started_at, status, env, engine)
    values ({{ run.run_id | sql_literal }}, current_timestamp, 'running', {{ run.env_name | sql_literal }}, {{ run.engine_name | sql_literal }});
4.2 File-based SQL (hooks/**/*.sql)
If kind: sql has no sql: body, FFT will look for a .sql file:
- Root: <project_dir>/hooks
- Pattern: hooks/**/<name>.sql
- name is the HookSpec.name from project.yml
Example:
hooks:
  on_run_end:
    - name: audit_run_end
      kind: sql  # SQL body lives in hooks/audit_run_end.sql
File layout:
hooks/
  audit_run_end.sql
  model_start_audit.sql
  audit/
    complex_audit_for_marts.sql   # name: complex_audit_for_marts
hooks/audit_run_end.sql:
-- examples/hooks_demo/hooks/audit_run_end.sql
-- Update the run-level audit row when the run finishes.
update _ff_run_audit
set
finished_at = current_timestamp,
status = 'success',
row_count = NULL,
error = NULL
where run_id = {{ run.run_id | sql_literal }};
If no matching file is found, the run fails with a clear error.
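The lookup rule can be sketched with pathlib. This is an illustration of the hooks/**/<name>.sql pattern, not FFT's actual resolver: the function name find_hook_sql, the FileNotFoundError type, and the choice to take the first match in sorted order when several files share a name are all assumptions.

```python
from pathlib import Path


def find_hook_sql(project_dir: Path, name: str) -> Path:
    """Locate hooks/**/<name>.sql under the project directory."""
    hooks_root = project_dir / "hooks"
    # rglob searches hooks/ itself and every subfolder.
    matches = sorted(hooks_root.rglob(f"{name}.sql")) if hooks_root.is_dir() else []
    if not matches:
        raise FileNotFoundError(f"no SQL file for hook {name!r} under {hooks_root}")
    # Assumption: with duplicate names, the first path in sorted order wins.
    return matches[0]
```

With the file layout above, find_hook_sql(project_dir, "complex_audit_for_marts") would return the path inside hooks/audit/.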
5. Jinja context for SQL hooks
SQL hook templates are rendered with:
- run: run context (similar to Python hooks; fields are plain values, see 5.1)
- model: model context (for model-level hooks), or None for pure run hooks
- node: alias of model
5.1 run context
In SQL hooks:
- All run.* fields are plain values in the Jinja context.
- When inlining them into SQL, always pass them through | sql_literal to get a safe SQL literal.
Example:
insert into _ff_run_audit (run_id, started_at, status, env, engine)
values ({{ run.run_id | sql_literal }}, current_timestamp, 'running', {{ run.env_name | sql_literal }}, {{ run.engine_name | sql_literal }});
5.2 model context
For model-level hooks:
{{ model.name }} -- logical model name ('events_clean.ff')
{{ model.path }} -- full filesystem path
{{ model.tags }} -- list of tags
{{ model.meta }} -- model meta dict from config(...)
You can use model or node – they’re the same object.
5.3 sql_literal filter
To safely inline values into SQL hook templates, use the sql_literal filter. It converts values as follows:
- None or Jinja Undefined → NULL
- bool → TRUE / FALSE
- int / float → 123, 1.23
- str → single-quoted with internal quotes escaped
- Other types → JSON-dumped and then single-quoted
Examples:
where run_id = {{ run.run_id | sql_literal }};
set status = {{ model.status | sql_literal }};
set rows_affected = {{ model.rows_affected | sql_literal }};
This helps avoid syntax errors and SQL injection in generated hook SQL.
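The conversion rules listed above can be approximated in plain Python. This is a sketch of the described behavior, not the filter's source: the name sql_literal_sketch is hypothetical, escaping by doubling single quotes is an assumption (the text only says quotes are escaped), and the Jinja Undefined case is omitted to avoid a Jinja dependency.

```python
import json
from typing import Any


def sql_literal_sketch(value: Any) -> str:
    """Approximate the documented sql_literal conversions."""
    if value is None:
        return "NULL"
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Assumption: internal single quotes are escaped by doubling them.
        return "'" + value.replace("'", "''") + "'"
    # Other types: JSON-dump, then quote the resulting string the same way.
    return "'" + json.dumps(value).replace("'", "''") + "'"


print(sql_literal_sketch("O'Brien"))
print(sql_literal_sketch(None))
```

The string case is what makes a value such as O'Brien safe to inline: the quote inside the value can no longer terminate the SQL literal early.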
6. Error handling & logging
- If a hook (SQL or Python) raises an exception, the run fails with a message like:
  Failed to execute on_run_end hook #1 for run: ...
- Hook execution is logged with the [hooks] prefix, for example:
[FFT] [hooks] when=on_run_start node=<run>: executing 3 hook(s): sql:create_audit_tables, sql:audit_run_start, python:python_banner
[FFT] [hooks] when=on_run_start node=<run> hook#1 kind=sql name='create_audit_tables' – rendering SQL
[FFT] [hooks] when=on_run_start node=<run> hook#1 name='create_audit_tables' executing SQL:
...
[FFT] [hooks] when=on_run_end node=<run> hook#2 kind=python name='python_summary' – invoking python hook
This makes it easy to see which hooks were registered and exactly what SQL they ran.
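As a rough mental model of the failure behavior described in this section, the sketch below runs hooks in order and wraps the first exception in an error that names the event and the hook position. The function name run_hooks_sketch and the use of RuntimeError are assumptions; the source only shows the message text, and FFT's real runner may differ.

```python
from typing import Any, Callable

Hook = Callable[[dict[str, Any]], None]


def run_hooks_sketch(when: str, hooks: list[Hook], ctx: dict[str, Any]) -> None:
    """Run hooks in order; wrap the first failure with the event and position."""
    for index, hook in enumerate(hooks, start=1):
        try:
            hook(ctx)
        except Exception as exc:
            # RuntimeError is an assumption; only the message format is documented.
            raise RuntimeError(
                f"Failed to execute {when} hook #{index} for run: {exc}"
            ) from exc


def failing_hook(ctx: dict[str, Any]) -> None:
    raise ValueError("audit table missing")


try:
    run_hooks_sketch("on_run_end", [failing_hook], {"run": {"run_id": "abc123"}})
except RuntimeError as err:
    message = str(err)
    print(message)
```

The hook number in the message corresponds to the hook's position in the list for that event, which matches the hook#N markers in the log lines above.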
7. Best practices
- Keep hooks idempotent. Especially SQL hooks: include run_id and model_name in audit tables so reruns don’t break things.
- Scope hooks with select. Use tags (tag:scope:mart, tag:example:hooks_demo) so hooks don’t run on every model.
- Be defensive in Python hooks. Treat ctx as a dict that may or may not have everything you expect (use .get()).
- Avoid heavy work in hooks. Hooks run inside the main pipeline. Use them for auditing/logging/notifications, not for big ETL jobs.
- Use sql_literal in SQL hooks. Whenever you inline values, go through | sql_literal instead of crafting quotes by hand.
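To illustrate the idempotency advice, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in engine. Keying the audit table on (run_id, model_name) means that re-running the same hook replaces the row instead of duplicating it. The trimmed table definition and the record_start helper are hypothetical, and insert or replace is SQLite syntax; other engines use their own upsert or merge statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table if not exists _ff_model_audit (
        run_id text,
        model_name text,
        status text,
        primary key (run_id, model_name)
    )
    """
)


def record_start(run_id: str, model_name: str) -> None:
    """Write the 'running' row; safe to call again for the same run and model."""
    conn.execute(
        "insert or replace into _ff_model_audit (run_id, model_name, status) "
        "values (?, ?, 'running')",
        (run_id, model_name),
    )


record_start("run-1", "events_clean.ff")
record_start("run-1", "events_clean.ff")  # simulated rerun of the same hook
row_count = conn.execute("select count(*) from _ff_model_audit").fetchone()[0]
print(row_count)  # 1
```

The sketch binds values as query parameters because it talks to the database directly; inside a templated SQL hook, the equivalent protection is the sql_literal filter.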