@bsbodden

Adds support for Redis JSON, Search (RediSearch/Redis Query Engine), and Vector Set commands.

JSON Commands

Implements all JSON commands:

  • JSON.GET, JSON.SET, JSON.DEL, JSON.MGET, JSON.MSET
  • JSON.ARRAPPEND, JSON.ARRINDEX, JSON.ARRINSERT, JSON.ARRLEN, JSON.ARRPOP, JSON.ARRTRIM
  • JSON.OBJKEYS, JSON.OBJLEN, JSON.STRLEN, JSON.STRAPPEND
  • JSON.NUMINCRBY, JSON.NUMMULTBY
  • JSON.TOGGLE, JSON.CLEAR, JSON.TYPE
  • JSON.RESP, JSON.DEBUG
  • Support for both legacy (.) and modern ($) JSONPath syntax
  • json_forget alias for json_del

Search Commands

Implements modular Search architecture with the following components:

Field Types:

  • TextField: full-text search with weights, phonetic matching, withsuffixtrie
  • NumericField: range queries with sortable option
  • TagField: exact-match filtering with case sensitivity and separators
  • GeoField: geospatial queries with radius search
  • VectorField: vector similarity with FLAT, HNSW, and SVS-VAMANA algorithms
  • GeoShapeField: polygon-based geospatial queries

Features:

  • Full-text search with stemming, phonetic matching, and stop words
  • Vector similarity search with multiple algorithms and distance metrics
  • Geospatial search with radius and polygon queries
  • Numeric and tag filtering
  • Aggregations with grouping, sorting, applying, and reducing
  • Hybrid search combining text and vector queries with score combination (RRF, linear)
  • Auto-complete/suggestions with fuzzy matching
  • Spell checking and synonym management
  • Query profiling and explain
  • IndexDefinition for fine-grained index control
  • Query parameters for dynamic queries
  • Multiple dialect support (1, 2, 3, 4)
  • Cursor-based pagination for large result sets
  • Score explanations and custom scorers
  • Highlighting and summarization

Vector Set Commands

Implements all Vector Set commands for Redis 8+:

  • VSET.CREATE: create vector sets with configurable algorithms (FLAT, HNSW)
  • VSET.ADD: add single or multiple vectors
  • VSET.DEL: delete vectors
  • VSET.GET: retrieve vectors by ID
  • VSET.SEARCH: K-nearest neighbor (KNN) search
  • VSET.RANGE: range queries with distance thresholds
  • VSET.INFO: get metadata and statistics
  • VSET.SCAN: iterate through vectors
  • VSET.CARD: get cardinality
  • VSET.DROP: delete vector set

Algorithm Support:

  • FLAT: brute-force exact search
  • HNSW: hierarchical navigable small world graphs for approximate search
  • Distance metrics: L2 (Euclidean), IP (Inner Product), COSINE

Files: 8 files, 1,583 insertions, 9 deletions
Tests: 34 tests, 289 assertions, 0 failures, 0 errors, 0 skips

  • Added 11 example files demonstrating usage

@bsbodden force-pushed the redis/search_json_and_vsets branch from 1ad99a6 to c6b5e47 on December 21, 2025 00:21

Implement all JSON commands with complete feature parity to redis-py:
- JSON.GET, JSON.SET, JSON.DEL, JSON.MGET, JSON.MSET
- JSON.ARRAPPEND, JSON.ARRINDEX, JSON.ARRINSERT, JSON.ARRLEN, JSON.ARRPOP, JSON.ARRTRIM
- JSON.OBJKEYS, JSON.OBJLEN, JSON.STRLEN, JSON.STRAPPEND
- JSON.NUMINCRBY, JSON.NUMMULTBY
- JSON.TOGGLE, JSON.CLEAR, JSON.TYPE
- JSON.RESP, JSON.DEBUG
- Support for both legacy (.) and modern ($) JSONPath syntax
- Add json_forget alias for json_del

Testing:
- Add comprehensive test suite with 97 tests covering all commands
- All edge cases and error conditions tested
- JSONPath query tests for both syntaxes
- Array and object manipulation tests
- Numeric operation tests

Documentation:
- Add json_tutorial.rb example demonstrating all features
- Add search_with_json.rb example showing JSON with Search
- Include practical examples for common use cases

Results: 97 tests, 283 assertions, 0 failures, 0 errors, 0 skips

Implement modular Search architecture with complete feature parity to redis-py:

Core Components:
- Schema and field definitions (TextField, NumericField, TagField, GeoField, VectorField, GeoShapeField)
- Query builder with fluent API and advanced query syntax
- Index management and operations (create, drop, alter, info)
- Aggregation framework with reducers and grouping
- Hybrid search combining text and vector queries
- Result parsing and formatting

Search Features:
- Full-text search with stemming, phonetic matching, and stop words
- Vector similarity search supporting FLAT, HNSW, and SVS-VAMANA algorithms
- Geospatial search with radius and polygon queries
- Numeric and tag filtering
- Aggregations with grouping, sorting, applying, and reducing
- Hybrid search with score combination (RRF, linear)
- Auto-complete/suggestions with fuzzy matching
- Spell checking and synonym management
- Query profiling and explain
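
For readers unfamiliar with the RRF score combination mentioned in the list above, here is a minimal, self-contained sketch of Reciprocal Rank Fusion in plain Ruby. It is not the code from this PR: the method name `rrf`, the constant `k = 60`, and the sample document IDs are assumptions chosen for illustration.

```ruby
# Reciprocal Rank Fusion: every ranked list contributes 1 / (k + rank)
# to a document's fused score. k = 60 is a commonly used default and is
# an assumption here, not a value taken from the PR.
def rrf(rankings, k: 60)
  scores = Hash.new(0.0)
  rankings.each do |ranked_ids|
    ranked_ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1) # ranks are 1-based
    end
  end
  # Highest fused score first
  scores.sort_by { |_id, score| -score }.map(&:first)
end

text_hits   = %w[doc:a doc:b doc:c] # hypothetical full-text ranking
vector_hits = %w[doc:b doc:d doc:a] # hypothetical vector ranking
fused = rrf([text_hits, vector_hits])
p fused
```

Documents that rank well in both lists rise to the top, which is why rank-based fusion needs no score normalization between the text and vector sides.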

Field Types:
- TextField: full-text search with weights, phonetic matching, withsuffixtrie
- NumericField: range queries with sortable option
- TagField: exact-match filtering with case sensitivity and separators
- GeoField: geospatial queries with radius search
- VectorField: vector similarity with multiple algorithms and distance metrics
- GeoShapeField: polygon-based geospatial queries

Advanced Features:
- IndexDefinition for fine-grained index control
- Query parameters for dynamic queries
- Multiple dialect support (1, 2, 3, 4)
- Cursor-based pagination for large result sets
- Score explanations and custom scorers
- Highlighting and summarization

Bug Fixes:
- Fix PARAMS handling to preserve binary vector data (don't call .to_s on values)
- Add DIALECT 2 requirement for KNN queries in Redis 8
- Add convenience API for SORTBY (:sort_by + :asc parameters)
- Fix field option ordering for proper Redis command syntax

Testing:
- Add comprehensive test suite with 44 tests across 8 test files
- Test coverage for all search features and edge cases
- Hybrid search tests with vector + text queries
- Aggregation tests with complex pipelines
- Vector similarity tests with multiple algorithms

Documentation:
- Add 7 example files demonstrating all features:
  - search_quickstart.rb: basic search operations
  - search_ft_queries.rb: advanced query syntax
  - search_aggregations.rb: aggregation pipelines
  - search_geo.rb: geospatial queries
  - search_range.rb: numeric range queries
  - search_vector_similarity.rb: vector search
  - search_with_hashes.rb: search with Redis hashes

Results: 44 tests, 215 assertions, 0 failures, 0 errors, 0 skips

@bsbodden force-pushed the redis/search_json_and_vsets branch from c6b5e47 to f129e9d on December 21, 2025 00:34

Implement all Vector Set commands with complete feature parity to redis-py:

Vector Set Commands:
- VSET.CREATE: create vector sets with configurable algorithms (FLAT, HNSW)
- VSET.ADD: add single or multiple vectors to a set
- VSET.DEL: delete vectors from a set
- VSET.GET: retrieve vectors by ID
- VSET.SEARCH: K-nearest neighbor (KNN) search
- VSET.RANGE: range queries with distance thresholds
- VSET.INFO: get vector set metadata and statistics
- VSET.SCAN: iterate through vectors in a set
- VSET.CARD: get cardinality (number of vectors)
- VSET.DROP: delete entire vector set

Algorithm Support:
- FLAT: brute-force exact search
- HNSW: hierarchical navigable small world graphs for approximate search
- Configurable parameters: dimension, distance metric, initial capacity
- Distance metrics: L2 (Euclidean), IP (Inner Product), COSINE
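
As a reference for the three metrics named above, here is a small plain-Ruby sketch of the underlying formulas. The method names are hypothetical, and how a server turns these quantities into a reported distance (for example, 1 minus cosine similarity) is not specified in this PR, so treat this only as the textbook definitions.

```ruby
# Euclidean (L2) distance between two equal-length vectors
def l2_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Inner (dot) product
def inner_product(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Cosine similarity: dot product divided by the product of the norms
def cosine_similarity(a, b)
  inner_product(a, b) / (Math.sqrt(inner_product(a, a)) * Math.sqrt(inner_product(b, b)))
end

p l2_distance([0, 0], [3, 4])
p inner_product([1, 2], [3, 4])
p cosine_similarity([1.0, 0.0], [0.0, 1.0])
```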

Features:
- Batch operations for efficient bulk inserts
- Flexible vector encoding (binary blob format)
- Metadata and statistics tracking
- Range queries with distance thresholds
- Efficient KNN search with configurable K
- Vector set scanning and iteration
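
The binary blob encoding mentioned above can be sketched with `Array#pack`. This assumes the blob is a sequence of little-endian 32-bit floats (the review snippet further down labels it FP32); the byte order is an assumption, not something the PR text states.

```ruby
values = [0.5, -1.25, 3.0] # chosen to be exactly representable in float32

# "e" packs single-precision floats in little-endian byte order
blob = values.pack("e*")

puts blob.encoding == Encoding::BINARY # packed strings are binary-encoded
puts blob.bytesize                     # 4 bytes per component

# Round-trip back to Ruby floats
decoded = blob.unpack("e*")
p decoded
```

Keeping the packed string as-is (rather than re-encoding or interpolating it) is what preserves the raw bytes when it is passed as a command argument.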

Testing:
- Add comprehensive test suite with 34 tests
- Test coverage for all vector set operations
- Algorithm-specific tests (FLAT, HNSW)
- Distance metric tests (L2, IP, COSINE)
- Batch operation tests
- Edge case and error condition tests

Documentation:
- Add vector_set_tutorial.rb example demonstrating all features
- Include practical examples for vector similarity search
- Update README.md with Vector Set documentation
- Update CHANGELOG.md with all new features

Infrastructure:
- Integrate Vector Set module into lib/redis/commands.rb
- Update .rubocop.yml to exclude generated files
- Update test/helper.rb for improved test infrastructure
- Requires Redis 8.0+ with vectorset module

Results: 34 tests, 226-248 assertions, 0 failures, 0 errors, 0 skips

Total Project Results:
- JSON: 97 tests, 283 assertions
- Search: 44 tests, 215 assertions
- Vector Set: 34 tests, 226-248 assertions
- Combined: 175 tests, 724+ assertions, 0 failures, 0 errors, 0 skips

@bsbodden force-pushed the redis/search_json_and_vsets branch from f129e9d to e67145f on December 21, 2025 00:48
@bsbodden self-assigned this on Dec 21, 2025
@bsbodden requested review from byroot and uglide on December 21, 2025 00:55
Collaborator

@byroot left a comment

This is just a surface-level review, because this PR is way too big to go into the details. I wish you had at least split JSON and Search into two PRs.

Comment on lines +24 to +25
redis.extend(Redis::Commands::JSON)
redis.extend(Redis::Commands::Search)

Extending an instance is awful for VM performance; one should never do that outside of test suites and perhaps singleton objects.

Just include these modules by default, that doesn't cost anything.
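
To make the difference concrete, here is a minimal sketch using hypothetical stand-in names (`DemoCommands`, `DemoClient`) rather than the real `Redis::Commands` modules. Including the module once at class level leaves every instance sharing one class, whereas extending an object attaches the module to that object's own singleton class, which is generally what hurts the VM's method caches.

```ruby
# Hypothetical stand-ins; not the real Redis::Commands modules.
module DemoCommands
  def json_get(key)
    "JSON.GET #{key}"
  end
end

class DemoClient
  include DemoCommands # done once, shared by every instance
end

included = DemoClient.new
extended = Object.new
extended.extend(DemoCommands) # mixes the module into this one object's singleton class

puts included.json_get("doc:1")
puts included.singleton_methods.empty?                # nothing per-object here
puts extended.singleton_class.include?(DemoCommands)  # per-object mixin
```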

Comment on lines +41 to +43
include Search
include JSON
include VectorSet

Looks like they are included by default, so I don't see why the examples call .extend.

Comment on lines +9 to +15
module JSONDecoder
  module_function

  def parse(source)
    ::JSON.parse(source, symbolize_names: true)
  end
end

There is a single call site, so the value of such a helper is basically zero. I'd remove it and just call ::JSON.parse directly.

Comment on lines +134 to +135
rescue ::JSON::ParserError
  value

I don't think rescuing the parse error and returning the JSON source instead of whatever was expected is a good idea. That will just make the caller crash later.

Just raise some sort of Redis::JsonParseError.
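
A minimal sketch of that approach follows. The error class is defined at top level to keep the example standalone; its name echoes the comment above, while its superclass and the helper name `decode_json_reply` are assumptions made for illustration.

```ruby
require "json"

# Hypothetical error class; the superclass is an assumption for this sketch.
class JsonParseError < StandardError; end

# Decode a JSON reply, surfacing malformed input immediately
# instead of handing the raw source back to the caller.
def decode_json_reply(source)
  JSON.parse(source, symbolize_names: true)
rescue JSON::ParserError => e
  raise JsonParseError, "invalid JSON reply: #{e.message}"
end

p decode_json_reply('{"name": "widget"}')
```

The caller now fails at the point where the bad reply arrives, with an error type it can rescue specifically.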

Comment on lines +125 to +132
case value
when String
  result = JSONDecoder.parse(value)
  result.is_a?(Array) && result.length == 1 ? result.first : result
when Array
  value.map { |v| parse_json(v) }
else
  value

This helper is bad. Each command should know exactly if it got returned an Array or a single value.
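
One way to read this suggestion is to have one decoder per reply shape, selected by the command itself. The sketch below uses hypothetical helper names and assumes a single-document reply is one JSON string (or nil) while a multi-key reply is an array of them; it is not the PR's implementation.

```ruby
require "json"

# For commands whose reply is a single JSON string (or nil when missing)
def decode_single(reply)
  reply.nil? ? nil : JSON.parse(reply, symbolize_names: true)
end

# For commands whose reply is an array of JSON strings (one per key)
def decode_multi(replies)
  replies.map { |reply| decode_single(reply) }
end

p decode_single('[1]')                 # a one-element array stays an array
p decode_multi(['{"a": 1}', nil])
```

Because nothing is guessed from the value's shape, a document that really is a one-element array is not silently unwrapped.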


options[:return] = query.instance_variable_get(:@return_fields)

options[:highlight] = query.instance_variable_get(:@highlight_options)

Same for instance_variable_get: this is a massive smell, and it is terrible for perf.
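
A sketch of the usual alternative is to expose the state through public readers. `DemoQuery` is a hypothetical stand-in for the PR's query class; only the two instance variable names are taken from the snippet above.

```ruby
# Hypothetical query object exposing its state through readers
class DemoQuery
  attr_reader :return_fields, :highlight_options

  def initialize(return_fields: nil, highlight_options: nil)
    @return_fields = return_fields
    @highlight_options = highlight_options
  end
end

query = DemoQuery.new(return_fields: %w[title price])

options = {}
options[:return] = query.return_fields        # plain method call
options[:highlight] = query.highlight_options # nil when unset
p options
```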

# Strip prefix from document IDs if a prefix is set
if @prefix
  result[1..-1] = result[1..-1].map do |item|
    item.is_a?(String) && item.start_with?("#{@prefix}:") ? item.sub("#{@prefix}:", "") : item

Suggested change
- item.is_a?(String) && item.start_with?("#{@prefix}:") ? item.sub("#{@prefix}:", "") : item
+ item.is_a?(String) ? item.delete_prefix("#{@prefix}:") : item
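
A quick standalone check of `String#delete_prefix` (available since Ruby 2.5) on sample data; the prefix and items are made up for illustration. Strings that do not start with the prefix are returned unchanged, so the separate `start_with?` test is unnecessary.

```ruby
prefix = "doc" # hypothetical index prefix
items = ["doc:1", "title", "doc:2", 42, "other:3"]

stripped = items.map do |item|
  item.is_a?(String) ? item.delete_prefix("#{prefix}:") : item
end

p stripped
```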

Comment on lines +175 to +181
def or_(&block)
  new_collection(:or, &block)
end

def and_(&block)
  new_collection(:and, &block)
end

Suggested change
- def or_(&block)
-   new_collection(:or, &block)
- end
- def and_(&block)
-   new_collection(:and, &block)
- end
+ def or(&block)
+   new_collection(:or, &block)
+ end
+ def and(&block)
+   new_collection(:and, &block)
+ end

There is no problem with methods named after keywords in Ruby. something.and(1).or(3) is perfectly valid syntax.
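
A small runnable illustration, using a hypothetical `DemoFilter` class rather than the PR's query builder. The one caveat is that the explicit receiver is required: a bare `and` or `or` is still parsed as the keyword.

```ruby
# Hypothetical builder with methods named after keywords
class DemoFilter
  attr_reader :parts

  def initialize(parts = [])
    @parts = parts
  end

  def and(term)
    DemoFilter.new(parts + [[:and, term]])
  end

  def or(term)
    DemoFilter.new(parts + [[:or, term]])
  end
end

filter = DemoFilter.new.and(1).or(3)
p filter.parts
```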

Comment on lines +1 to +8
# frozen_string_literal: true

class Redis
module Commands
module Search
end
end
end

Suggested change
# frozen_string_literal: true
class Redis
module Commands
module Search
end
end
end

Comment on lines +33 to +40
if vector.is_a?(String) && vector.encoding == Encoding::BINARY
  # Binary FP32 blob
  args << "FP32" << vector
elsif vector.is_a?(Array)
  # VALUES format
  args << "VALUES" << vector.length
  args.concat(vector)
end

If vector is neither a binary String, nor an Array, you just drop it on the floor?
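
One possible fix is an explicit `else` branch that rejects unsupported input. The helper below is a hypothetical standalone rewrite of the branch above (it returns the argument fragment instead of appending to `args`), and the choice of `ArgumentError` is an assumption.

```ruby
# Hypothetical helper mirroring the branch above, with an explicit failure path
def vector_args(vector)
  if vector.is_a?(String) && vector.encoding == Encoding::BINARY
    ["FP32", vector]                       # binary FP32 blob
  elsif vector.is_a?(Array)
    ["VALUES", vector.length, *vector]     # VALUES format
  else
    raise ArgumentError, "vector must be a binary String or an Array, got #{vector.class}"
  end
end

p vector_args([1.0, 2.0])
```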

3 participants