fastflowtransform.table_formats.base
SparkFormatHandler
Bases: ABC
Abstract base for Spark table format handlers (Delta, Parquet, Iceberg, ...).
Responsibilities
- Saving a DataFrame as a managed table.
- Incremental INSERT semantics.
- Optional incremental MERGE semantics (can raise NotImplementedError).
This is intentionally minimal so that engines (DatabricksSparkExecutor) can:
- Delegate managed table handling to the handler.
- Still implement engine-level fallbacks for merge semantics.
Source code in src/fastflowtransform/table_formats/base.py
run_sql
run_sql(sql)
Execute SQL via the injected runner (guardable in the executor).
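The injected-runner idea can be sketched as below. This is a minimal illustration under stated assumptions, not the library's source: `HandlerSketch`, `guarded_runner`, and the guard rule (blocking `DROP` statements) are hypothetical names and behavior chosen for the example.

```python
from typing import Any, Callable, List


class HandlerSketch:
    """Hypothetical stand-in for a format handler; not the real class."""

    def __init__(self, sql_runner: Callable[[str], Any]) -> None:
        # The executor decides how SQL is actually run and injects that choice.
        self._sql_runner = sql_runner

    def run_sql(self, sql: str) -> Any:
        # Every statement goes through the injected callable, so the
        # executor can wrap it with guards, logging, or dry-run logic.
        return self._sql_runner(sql)


executed: List[str] = []


def guarded_runner(sql: str) -> str:
    # Hypothetical guard: refuse destructive statements, record the rest.
    if sql.strip().upper().startswith("DROP "):
        raise PermissionError(f"blocked statement: {sql}")
    executed.append(sql)
    return "ok"


handler = HandlerSketch(guarded_runner)
handler.run_sql("SELECT 1")
```

Because the handler never talks to Spark directly in this sketch, swapping the runner is enough to change how (or whether) statements execute.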
qualify_identifier
qualify_identifier(table_name, *, database=None)
Return the physical table identifier for Spark APIs (unquoted).
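A rough sketch of what an unquoted, qualified identifier might look like. The function name `qualify_identifier_sketch` and the rule that an already-qualified name is left alone are assumptions made for illustration; the real method may resolve names differently.

```python
from typing import Optional


def qualify_identifier_sketch(table_name: str, *, database: Optional[str] = None) -> str:
    # Assumed rule: prefix the database only when one is given and the
    # name is not already qualified. No quoting is applied here.
    name = table_name.strip()
    if database and "." not in name:
        return f"{database}.{name}"
    return name


qualified = qualify_identifier_sketch("orders", database="analytics")
```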
format_identifier_for_sql
format_identifier_for_sql(table_name, *, database=None)
Return a SQL-safe identifier (per-part quoted) for the table.
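Per-part quoting could look roughly like the sketch below. Spark SQL quotes identifiers with backticks and escapes an embedded backtick by doubling it; everything else here (the name `format_identifier_for_sql_sketch`, splitting the input on dots, and assuming the incoming parts are unquoted) is an assumption for illustration.

```python
from typing import Optional


def format_identifier_for_sql_sketch(table_name: str, *, database: Optional[str] = None) -> str:
    # Assumes the incoming name is unquoted and uses "." only as a separator.
    parts = [p for p in table_name.split(".") if p]
    if database and len(parts) == 1:
        parts.insert(0, database)
    # Quote each part separately; double any embedded backtick.
    return ".".join("`" + p.replace("`", "``") + "`" for p in parts)


quoted = format_identifier_for_sql_sketch("daily orders", database="my-db")
```

Quoting each part on its own keeps the dot as a separator, whereas quoting the whole string would turn `db.table` into a single oddly named table.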
allows_unmanaged_paths
allows_unmanaged_paths()
Whether storage.path overrides should be honored for this format.
save_df_as_table (abstractmethod)
save_df_as_table(table_name, df)
Save the given DataFrame as a (managed) table.
The input name is the fully-qualified identifier Spark should use, e.g. "db.table" or just "table".
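Since this method is abstract, each concrete handler has to supply it. The sketch below shows one possible shape using PySpark's `DataFrameWriter` chain (`format`, `mode`, `saveAsTable`). The class names `BaseHandlerSketch` and `ParquetHandlerSketch` and the choice of `overwrite` mode are assumptions, not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseHandlerSketch(ABC):
    """Hypothetical stand-in for the abstract base."""

    @abstractmethod
    def save_df_as_table(self, table_name: str, df: Any) -> None:
        ...


class ParquetHandlerSketch(BaseHandlerSketch):
    def save_df_as_table(self, table_name: str, df: Any) -> None:
        # `df` is expected to be a pyspark.sql.DataFrame; it is typed as Any
        # so this sketch can be imported without PySpark installed.
        # Overwrite mode is an assumption made for the example.
        df.write.format("parquet").mode("overwrite").saveAsTable(table_name)
```

Instantiating `BaseHandlerSketch` directly raises `TypeError`, which is how Python enforces that subclasses provide the method.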
incremental_insert
incremental_insert(table_name, select_body_sql)
Default incremental INSERT implementation, format-agnostic.
select_body_sql must be a SELECT-able body (no trailing semicolon), e.g. "SELECT ... FROM ...".
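A format-agnostic insert can be pictured as wrapping the SELECT body in an `INSERT INTO` statement. The helper below is a hypothetical sketch, not the library's code: the name `build_incremental_insert_sql` is invented, and the defensive stripping of a trailing semicolon is an assumption (the documented contract is that callers pass a body without one).

```python
def build_incremental_insert_sql(table_name: str, select_body_sql: str) -> str:
    # Assumed defensive cleanup: drop surrounding whitespace and any
    # trailing semicolon so the body can be embedded in a larger statement.
    body = select_body_sql.strip().rstrip(";").strip()
    if not body:
        raise ValueError("select_body_sql must not be empty")
    # table_name is expected to be already SQL-safe (quoted per part).
    return f"INSERT INTO {table_name} {body}"


statement = build_incremental_insert_sql(
    "`db`.`events`",
    "SELECT * FROM staging WHERE ts > '2024-01-01';",
)
```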
incremental_merge
incremental_merge(table_name, select_body_sql, unique_key)
Optional: incremental MERGE semantics (UPSERT-like). Subclasses may override this. Default: not supported (raises NotImplementedError).
Engines using this handler MUST be prepared to handle NotImplementedError and fall back to a more generic strategy.
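The fallback contract can be sketched from the engine's side as follows. All names here (`InsertOnlyHandlerSketch`, `merge_with_fallback`, the `fallback` callable) are hypothetical, and passing `unique_key` as a list of column names is an assumption; only the "catch NotImplementedError and fall back" behavior comes from the text above.

```python
from typing import Any, Callable, List


class InsertOnlyHandlerSketch:
    """Hypothetical handler that does not support MERGE."""

    def incremental_merge(self, table_name: str, select_body_sql: str, unique_key: List[str]) -> None:
        raise NotImplementedError("MERGE is not supported for this format")


def merge_with_fallback(
    handler: Any,
    table_name: str,
    select_body_sql: str,
    unique_key: List[str],
    fallback: Callable[[str, str, List[str]], None],
) -> str:
    """Try the handler's MERGE first; use the engine-level strategy otherwise."""
    try:
        handler.incremental_merge(table_name, select_body_sql, unique_key)
        return "handler"
    except NotImplementedError:
        # Only the "not supported" signal triggers the fallback;
        # genuine failures still propagate to the caller.
        fallback(table_name, select_body_sql, unique_key)
        return "fallback"
```

Catching only `NotImplementedError` keeps real errors (bad SQL, missing tables) visible instead of silently rerouting them to the generic path.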