MDEV-39834: Add TidesDB 9 storage engine#5166
Conversation
The libtidesdb C library is vendored under storage/tidesdb/libtidesdb/ and compiled into a static archive that is statically linked into the loadable plugin module, mirroring storage/rocksdb, the engine is self-contained and needs no system-installed libtidesdb. include/providers is removed from the engine's include dirs so the bundled zstd/lz4/snappy link directly instead of routing through MariaDB's compression-provider plugins. For cross-version portability, spatial- and fulltext-index detection uses the HA_SPATIAL / HA_FULLTEXT key flags (with a HA_SPATIAL/HA_FULLTEXT -> *_legacy rename shim) rather than KEY::algorithm, which older servers leave unset; my_global.h is included before handler.h so the server typedefs it needs are defined. Adds the tidesdb mysql-test suite covering CRUD, MVCC and pessimistic locking, transactions/savepoints, isolation levels, compression, encryption, TTL, full-text, spatial, partitioning, object-store offload and crash/durability. The vector test auto-skips where the VECTOR type is unavailable (before 11.7). Builds and full mysql-test suite shows green on MariaDB 11.4 (LTS) through to 13.0.2
There was a problem hiding this comment.
Code Review
This pull request integrates the TidesDB storage engine into MariaDB, adding the core handler implementation in ha_tidesdb.h, build configuration in CMakeLists.txt, vendored libtidesdb source files, and an extensive test suite. The reviewer provided several constructive suggestions, including moving the public header copy operation in CMakeLists.txt from configure time to build time to prevent stale headers during incremental builds. Additionally, the reviewer recommended using C++11 in-class member initializers for various member variables in ha_tidesdb.h to avoid uninitialized memory issues, and explicitly marking the ha_tidesdb destructor with override.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
|
Base (11.4) >> 11.5 >> 12.0 >> … >> 12.x >> 13.0 >> main Can propagate forward to every newer branch automatically. |
|
Full TideSQL reference: https://tidesdb.com/reference/tidesql/ |
|
I see Having a look as I'm gonna guess we need to pass that wf. |
|
C++11 is base I am going to assume, 13 seems to be C++17. I'll keep the engine at bay with C++11 as the library aligns with C11. |
Build fixes for MariaDB 11.4-11.7 (C++11; buildbot builds with -Werror): - Replace C++17 structured bindings with C++11 iterator access in the FTS write_row/update_row/delete_row and ft_init_ext paths. - Guard TABLE::part_info with WITH_PARTITION_STORAGE_ENGINE, which only defines that member when partitioning is compiled in. Initialization hardening (PR review feedback): - Value-initialize TidesDB_share encryption members (encrypted, encryption_key_id, encryption_key_version) in the constructor. - Mark ~ha_tidesdb() override.
|
Something to have a look at: https://buildbot.mariadb.org/#/builders/554/builds/22266 and https://buildbot.mariadb.org/#/builders/554/builds/22266/steps/3/logs/stdio not sure if CI or something I need to touch on at plugin level. |
|
Caught it, had to create mariadb-plugin-tidesdb.install |
|
https://buildbot.mariadb.org/#/builders/534/builds/39281/steps/6/logs/stdio buildbot/amd64-ubuntu-2204-debug-ps does not seem to be due to TidesDB implementation. v9.3.3 of TidesDB and v4.5.4 of TideSQL plugin implemented. Ready for review. |
Right, there is an open ticket MDEV-38195 about failures of the test |
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution! This is a preliminary review.
I'm approving it as it is, as there're no major violations. Or, if there are any, they're in the competency of the final reviewer. Nevertheless, I'll mention what I see that can be improved.
First of all: please consider merging some of the commits. Ideally there should be one commit per distinct "feature". I'd guess that makes 1 in your case.
Next: this technically is a new feature. As such, I'd suggest considering rebasing to the main branch. Unless of course, there is some special consideration why the targed should be as is.
Finally, there should be a Jira for house-keeping. I'll start one for you, using the PR explanation. But this needs to be improved significantly I'd guess, until it evolves into a proper design specification that would help with the final review.
There's also one small thing I've noticed below.
Please stand by for the final review.
There was a problem hiding this comment.
I do not think you need this file.
There was a problem hiding this comment.
Thank you @gkodinov understood for 1. For 2, to be able to propagate forward to newer branches selectively or not and main with a floor of support for 11.4. For 3 understood, I appreciate the help with that and I can work off it from there.
There was a problem hiding this comment.
I do not think you need this file.
You're right it's a fossil.
There was a problem hiding this comment.
Thank you @gkodinov understood for 1. For 2, to be able to propagate forward to newer branches selectively or not and main with a floor of support for 11.4. For 3 understood, I appreciate the help with that and I can work off it from there.
Have you tried just plaining atop of the main branch and re-compiling. You're not changing anything in the server itself. So it might work just as well. You could maybe need to alter some of the calls back to the server, but I do not think it has changed that much. Then it's a matter of running the regression tests successfully (and debugging any failures).
I (and the final reviewer) can guide you through that.
There was a problem hiding this comment.
All conditionals are handled from 11.4 to 13.02, though verified sparingly there is a full list. Main sounds ok, if condenses is main, we can do that.
|
AFAIK 11.4 closed for the new features, why 11.4 not main? Especially taking into account required dependencies zstd, lz4, and snappy |
|
target_compile_options(tidesdb_embedded PRIVATE -w) Why all warning supressed in so big code (we have enough problems with warnings with existing things why add more) |
Well, I originally thought per getting it into a lower branch and merging upwards for visibility was a goal per discussion with foundation, I've brought up a few times. If the goal is newer versions only you guys can tell me and I can correct accordingly. TideSQL was written for 11.4 onwards to modern 13.02 in testing. Hitting many granular versions. We have a list if you require. These versions I've supported have conditionals a lot because things do move around quite a bit as you probably know. Thus we want I'm certain the same thing where a user and everyone is happy. Cheers |
Can you further explain? We build and develop with all sanitizers on and debug flags and running on 16 different platforms and x86 and x64 in continuous integration. Please if you have a moment look at the TidesDB repository and the TideSQL main repositories. If you have a concern specifically with this PRs build let me know how you'd prefer it because I am not following currently properly. The plugin itself I'm highly doubtful would have any warnings as we would catch them in engine testing and continuous integration and plugin testing and integration, but i cant remember right now why it's set the way it is, we can modify a compile option for inclusion? Is this what you're pointing out? Thank you kindly |
|
Hm so @sanja-byelkin do you prefer vendored thirdparty code to use not ignore compiler warnings? I can do Which is review friendly I believe. Do let me know, I shall make the change ASAP. Thank you for your review. Cheers. |
|
Upstream on the TidesDB library I am doing a scrub with what you've asked for to assure all is well. The flag you brought up is known to have false positives so I will take time with diligence there. |
Refresh the vendored libtidesdb sources under storage/tidesdb/libtidesdb to release 9.3.5 (the engine previously bundled 9.3.3). Release 9.3.5 also resolves the compiler warnings for variables that may be used uninitialized, so the bundled library now compiles clean under that GCC diagnostic. Compression backends are now optional. compress.c includes zstd, lz4 and snappy only when the matching TIDESDB_HAVE_ZSTD, TIDESDB_HAVE_LZ4 or TIDESDB_HAVE_SNAPPY macro is defined. The CMake build now autodetects each development library and enables the ones that are present, defining the TIDESDB_HAVE macros and linking each found library, and a build with none still works. This removes the earlier hard requirement on all three compression libraries. Sync ha_tidesdb.cc and the analyze, engine_status, index_stats and status_vars tests from the upstream tidesql sources. The full MTR suite passes on MariaDB 11.4 with 65 tests green and the vector test skipped where the VECTOR type is unavailable.
Per review, replace the blanket -w on the vendored libtidesdb target with targeted suppression of the three warning categories the third party C library trips under MariaDB's stricter C flags -Wdeclaration-after-statement (the library uses C99/C11 declare at use style while MariaDB compiles C as C90), -Wframe-larger-than (a few functions use large on stack buffers), and -Wdiscarded-qualifiers (one const qualifier drop). Any other warning still surfaces and, in maintainer mode, fails the build, so genuine regressions are no longer masked.
libtidesdb uses C11 stdatomic.h, which MSVC only enables under /experimental:c11atomics; without it the Windows build fails with C1189 "C atomic support is not enabled". Add an MSVC branch to the tidesdb_embedded target setting /std:c17 /experimental:c11atomics plus the same targeted /wd warning disables the library uses upstream (the Windows build compiles with /WX). The GCC and Clang warning handling is unchanged.
|
MSVC pthreads issue reviewing builtbot, we handle differently at library level. Investigating to see if another patch is required at library level to fit MariaDB. edit: yeah I'm gonna have to adjust compat layer a bit to comply. Working on that now. |
…ds backend Re-vendor the libtidesdb 9.3.6 sources. 9.3.6 adds a native Win32 threading backend in compat.h (SRWLOCK mutexes and rwlocks, CONDITION_VARIABLE condition variables, _beginthreadex threads, and fiber-local storage so thread-local destructors still run), so on MSVC it no longer includes pthread.h and needs no pthreads-win32 library. This fixes the MariaDB MSVC build, which has no pthreads-win32 and was failing with C1083 "Cannot open include file pthread.h". POSIX, macOS and MinGW keep using pthreads as before. Only compat.h, tidesdb.c and tidesdb.h change from 9.3.5; the C11 atomics flag added earlier (/experimental:c11atomics) is still required on MSVC.
decode_varint returns negative without writing its output when max_bytes is zero, which happens on a truncated or corrupt block, and several indexed read and iterator decode sites used the result without checking the return, so a handful of locals (ks vs vo seq_val vlog_off) could be read uninitialized. zero initialize those locals so the failure path yields a defined zero rather than garbage. also add a TIDESDB_WARN_MAYBE_UNINIT option, gcc only and off by default, that turns the warning on for an optimized dev build where the analysis actually runs. analyzed for MariaDB engine addition MariaDB/server#5166 review - correct an iterator scan miss under reaper eviction and stop the cross cf atomicity test flaking on commit backpressure iter_new pinned the level array once then skipped any sstable it could not immediately ref treating it as dead when in fact the reaper had it in a transient evicting window with the descriptor still live in the level. a reader iterator snapshots only once so that skip dropped a live sstable for the whole life of the scan and a key in it read back as not found. now we spin on try_ref while the level array is unchanged and only bail to retry when the array actually changed under us same shape as the point get reaper evict skip we already correctd. the cross cf atomicity test counted a transient busy commit as a hard error which tripped the errors assert on slow loaded ci boxes. commit through tdb_test_commit_with_retry like the sibling writers so a backpressure stall retries instead of failing the run.
Re-vendor the current libtidesdb sources, picking up the latest fixes made upstream to pass the Windows build. Changes are confined to btree.c, compat.h and tidesdb.c; the plugin and the test suite are unchanged.
Mirror libtidesdb's own MSVC flags by adding /wd4267 next to the existing /wd4244 narrowing-warning disable on the tidesdb_embedded target. Under the MariaDB Windows build's /WX, the library's size_t to int narrowings (e.g. "const int total = queue_size(...)") were promoted to C2220 errors. The library accepts this warning class (it already disables the sibling C4244), so disable C4267 as well and handles explicit checks where applicable. GCC, Clang and the Linux build are unaffected.
…fset Field::val_int_offset() takes a uint row offset, so cast the record-buffer offset to uint instead of my_ptrdiff_t. Under the MariaDB Windows build's /WX the my_ptrdiff_t to uint conversion was promoted to a C2220 error. The offset is record[1] - record[0] (rec_buff_length), which always fits in uint, so the cast is value-preserving. GCC and the Linux build are unaffected.
|
I just tried to build it on my system (Fedora 44).
Compilation options was
|
Working on that, OpenSSL dep was removed from library to play nicely, we were only using it on 2 methods for object store mode; we ended up writing the sha256, hmac algorithms as they were small enough, the next commit will pass.
Understood and no it's not default, its just available, similar to RocksDB. |
|
We discussed version issue. Final decision is not made yet. My opinion is that it is OK to make it in 11.4 but not OK to make it by default. becouase it will disturb distributives and user who build it from sources. Stable version should be stable. |
|
For clarity, "default" I mean build by default without enabling it through cmake options. |
|
No problem, thank you @sanja-byelkin |
|
@sanja-byelkin currently the plugin is built but not loaded by default. Reading the code i'll add DISABLED to the MYSQL_ADD_PLUGIN call then to actually build the plugin |
|
Before I do this though, will this affect testing TidesDB builds and MTR? Do let me know how you'd like to approach if possible so I can line next commit up properly. |
|
About tests, it would be way better if tests will be placed under storage/tidedb (see rocksdb, columnstore or connect for example). Regression test suite looks in mysql-test directories of the storage engines. Old tests placed in main mysql-test mostly historically. Also the other general observation (i have not looked deep yet), I see stress tests (at least by name) mixed with regression tests, it is better divide them in different suites, because one should be run always the other (again at least by name) something time consuming. Also just to remember that for really heavy long running tests (not stress tests just something huge) we have --big option and i can be checked inside such tests to avoid running them each time. It is not for stress testing just for really big tests, I mentioned it here to give full impression about our test suite abilities. |
No problem at all, will move into storage directory out of historical.
The 300+ tests are all short, their all primarily integration tests, I wouldn't consider real "stress" per se, we test isolation on concurrent load, autocommit stress, but stress mostly on concurrency. Naming can be refactored, I'll go through and assure naming describes the running tests a bit better. No problem for me. |
|
tidesdb_object_store.test - should check somehow that the host setup S3 (I do not have S3 setup on my computer but still want to run the test suite without S3, the same for buildbot builders (at least some). It would be nice to detect absence of S3 by asking server/plugin, in worst case by environment variable. For inspiration you can check mysq-test/include/have_* |
In TidesDB object store mode can be used with FS by default, so it's testing that path but yes variables for say minio, rust-fs, s3 could be valuable to configure for that specifically. |
|
An update about versions. We just now are accepting DuckDB, it will make a precedent and we will just follow with TideDB. It should not take more than several days and we have what to fix during this time :) |
|
Tide**s**DB, also understood. I saw the engine from the PLC, I understand. One thing, I do wanna state is on the repository side we are sticking to gamma as we have progressed through that hierarchy on our own side, though it may differ internally. I've brought up the version confusion with foundation as I went by what was in code, there was no stern documentation, I went with alpha, beta, preprod. After this, I'm sure we will have documentation on it!! :D |
The libtidesdb C library is vendored under storage/tidesdb/libtidesdb/ and compiled into a static archive that is statically linked into the loadable plugin module, mirroring storage/rocksdb, the engine is self-contained and needs no system-installed libtidesdb. include/providers is removed from the
engine's include dirs so the bundled zstd/lz4/snappy link directly instead of routing through MariaDB's compression-provider plugins.
For cross-version portability, spatial- and fulltext-index detection uses the HA_SPATIAL / HA_FULLTEXT key flags (with a HA_SPATIAL/HA_FULLTEXT -> *_legacy rename shim) rather than KEY::algorithm, which older servers leave
unset; my_global.h is included before handler.h so the server typedefs it needs are defined.
Adds the tidesdb mysql-test suite covering CRUD, MVCC and pessimistic locking, transactions/savepoints, isolation levels, compression, encryption, TTL, full-text, spatial, partitioning, object-store offload and crash/durability. The vector test auto-skips where the VECTOR type is unavailable (before 11.7).
Builds and full mysql-test suite shows green on MariaDB 11.4 (LTS) through to 13.0.2