feat: implement templating system based on tstrings #409

Open
NickCrews wants to merge 29 commits into duckdb:main from NickCrews:t-strings

@NickCrews NickCrews commented Mar 29, 2026

This is an implementation of #370. It is still a WIP, but I think the shape of the API is about 90% there. I'm opening this PR so @evertlammerts can take a look and give some high-level comments. Once we iron out the large-scale questions, I'll fix those up and do more polishing before asking for a more detailed review.

Summary

This PR implements a SQL templating system for duckdb-python based on Python t-strings (Discussion #370), while remaining usable from non-3.14 code paths.

The core idea is:

  • Interpolations default to bound parameters (safe by default).
  • Objects that can render SQL implement __duckdb_template__ and are expanded as SQL fragments/subqueries.
  • t-string conversions (!s, !r, !a) are treated as explicit raw interpolation (matching Python semantics).
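
A rough sketch of how these three rules could be applied to a single t-string interpolation slot. This is not the PR's code: `classify`, `Param`, and `RawSql` are hypothetical stand-ins, and the precedence between a conversion and `__duckdb_template__` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a value that will be bound as a parameter."""
    value: object


@dataclass
class RawSql:
    """Hypothetical stand-in for text spliced into the SQL verbatim."""
    text: str


def classify(value, conversion=None):
    """Decide how one interpolated value is treated (illustrative only)."""
    if conversion in ("s", "r", "a"):
        # An explicit conversion opts out of binding: convert, then splice in.
        convert = {"s": str, "r": repr, "a": ascii}[conversion]
        return RawSql(convert(value))
    if hasattr(value, "__duckdb_template__"):
        # The object renders itself as a SQL fragment or subquery.
        return value.__duckdb_template__()
    # Everything else is bound as a parameter (safe by default).
    return Param(value)
```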

For Python 3.14+, a kitchen-sink example:

import duckdb

conn = duckdb.connect()

class Users:
    def __init__(self, columns: list[str], active: bool):
        self.columns = columns
        self.active = active

    def __duckdb_template__(self, **kwargs):
        active = self.active
        columns = ", ".join(self.columns)
        # note the use of !s to indicate raw SQL
        return t"SELECT {columns!s} FROM users WHERE active = {active}"

active_users = conn.sql(Users(['name', 'age'], True))
query = duckdb.compile(t"SELECT * FROM ({active_users}) WHERE age > {duckdb.param(30, 'min_age', exact=True)}")
query.sql  # "SELECT * FROM (SELECT name, age FROM users WHERE active = $p0_active) WHERE age > $min_age"
query.params  # {"p0_active": True, "min_age": 30}

What’s included

1) New duckdb.template module

Introduces CompiledSql(sql: str, params: dict[str, object]), a simple dataclass that holds the compiled SQL (with param names resolved).

To create one on Python < 3.14, you will most likely use template(*parts: str | Interpolationish | Param) to build a SqlTemplate object. SqlTemplate is closely analogous to the Template type built into Python 3.14: it is a sequence of alternating strs and Interpolations. It also supports evaluating any object that implements a __duckdb_template__() method, which allows for some nice usage patterns.

I keep SqlTemplate a first-class citizen so users have programmatic access to the raw data structure, in case they want to get in the middle of the process. For example, the Interpolations in a SqlTemplate can themselves hold another SqlTemplate, for arbitrary nesting. Most users, though, will skip straight to compiling to a CompiledSql. That happens in two steps:

  • SqlTemplate.resolve() recursively resolves nested Templates and Interpolations and returns a ResolvedSqlTemplate: a flat sequence of strs and Interpolations in which every Interpolation is guaranteed to hold a Param.
  • ResolvedSqlTemplate.compile() then assigns the final names to all the Params and generates the final CompiledSql.
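
To make the first step concrete, here is a toy model of the flattening, not the PR's implementation. It assumes a template is just a Python list whose items are SQL text (`str`), nested lists, objects with `__duckdb_template__`, or plain values. The `resolve` function and `Param` class below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a bound parameter."""
    value: object


def resolve(parts):
    """Flatten nested part lists into a flat list of str and Param."""
    flat = []
    for part in parts:
        if isinstance(part, list):
            # A nested template: resolve it and splice the result in place.
            flat.extend(resolve(part))
        elif hasattr(part, "__duckdb_template__"):
            # In this toy model the protocol method returns a list of parts.
            flat.extend(resolve(part.__duckdb_template__()))
        elif isinstance(part, (str, Param)):
            # SQL text and ready-made params pass through unchanged.
            flat.append(part)
        else:
            # Any other value is wrapped so it will be bound, not inlined.
            flat.append(Param(part))
    return flat
```

After this step nothing is nested any more, so the later naming pass only has to walk one flat list.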

Behavior highlights:

  • Nested templates are resolved recursively.
  • Non-string interpolations become named DuckDB params.
  • Friendly param naming: exact param names, autogenerated names, or autogenerated names with a suffix.
  • Duplicate param names are rejected.
  • Params are always passed as named params, never positional.
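
The naming rules can be sketched as follows, under stated assumptions. The `p<N>` and `p<N>_<suffix>` forms are inferred from the `$p0_active` output in the example above. The counter semantics, the `ValueError`, and the names `compile_parts` and `Param` are hypothetical, not the PR's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Param:
    """Hypothetical stand-in: a value plus an optional exact name or suffix."""
    value: object
    name: Optional[str] = None
    exact: bool = False


def compile_parts(flat):
    """Turn a flat list of str/Param into (sql, params) with named params."""
    chunks = []
    params = {}
    position = 0  # assumption: counts every param, in order of appearance
    for part in flat:
        if isinstance(part, str):
            chunks.append(part)
            continue
        if part.exact:
            name = part.name  # use the name exactly as given
        elif part.name:
            name = f"p{position}_{part.name}"  # autogenerated with a suffix
        else:
            name = f"p{position}"  # fully autogenerated
        if name in params:
            raise ValueError(f"duplicate parameter name: {name}")
        params[name] = part.value
        chunks.append(f"${name}")  # DuckDB's named-parameter placeholder
        position += 1
    return "".join(chunks), params
```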

2) Query API integration (Python typing + runtime)

The main SQL entry points now accept template inputs (SqlTemplate / CompiledSql) in addition to existing string/statement inputs, including:

  • connection methods: sql, query, from_query, execute, executemany
  • module-level wrappers for the same operations
  • Open question: should we be even more coercive and accept anything that template() accepts? E.g., should conn.sql(["SELECT * FROM users WHERE id = ", 123]) work?

3) C++ execution path support

Adds normalization in DuckDBPyConnection to handle template-like objects at runtime:

  • If the object has .compile(), it is compiled before statement parsing.
  • If the object has .sql/.params, those are consumed directly.
  • Compiled params are merged with user-provided params:
    • dict + dict merge supported
    • duplicate key collisions raise
    • mixing compiled named params with non-empty positional params raises
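
The merge rules above, written out in Python for illustration only. The real logic lives in C++ in DuckDBPyConnection. The function name and the `ValueError` type here are assumptions.

```python
def merge_params(compiled, user=None):
    """Combine params from a compiled template with user-supplied params."""
    if user is None:
        return dict(compiled)
    if isinstance(user, dict):
        # dict + dict: merge, but refuse to silently overwrite a name.
        clashes = compiled.keys() & user.keys()
        if clashes:
            raise ValueError(f"duplicate parameter names: {sorted(clashes)}")
        return {**compiled, **user}
    # Otherwise treat `user` as a positional sequence.
    if compiled and len(user) > 0:
        raise ValueError("cannot mix compiled named params with positional params")
    return list(user) if len(user) > 0 else dict(compiled)
```

An empty positional list is tolerated alongside compiled named params, matching the "non-empty" wording in the last rule.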

This allows passing compiled/template objects directly into execution/query paths without requiring manual .sql/.params unpacking by users.

4) Built-in object support for interpolation

Adds __duckdb_template__ implementations for key DuckDB Python objects so they compose naturally in templates:

  • DuckDBPyRelation -> SQL form (ToSQL)
  • DuckDBPyExpression -> expression SQL string
  • DuckDBPyType -> type string
  • I am probably missing other types.

This enables patterns like relation/type/expression interpolation directly inside templates/t-strings, e.g. template("SELECT 42.5::", duckdb.DOUBLE)

5) Tests

Adds broad coverage across:

  • pure-Python template internals (construction, parsing, resolution, compilation, naming, protocol behavior, nesting, conversion semantics)
  • Python 3.14 t-string-specific behavior. This module can only be parsed on Python 3.14+, so I still need to wire up the harness to load this test file only on those versions.
  • end-to-end integration with connection/module SQL APIs
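
One possible way to do that wiring, assuming the suite runs under pytest, is the `collect_ignore` hook in a `conftest.py` next to the test file. The file name `test_tstrings.py` is hypothetical.

```python
# conftest.py (sketch)
import sys

# Paths listed here are relative to this conftest.py. pytest skips them during
# collection, so older interpreters never try to parse the t-string syntax.
collect_ignore = []
if sys.version_info < (3, 14):
    collect_ignore.append("test_tstrings.py")  # hypothetical file name
```

A `pytest.mark.skipif` inside the file would not help, because the file fails with a SyntaxError at import time before any marker is evaluated.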
