docs: Add MongoDB data-source and offline-store reference documentation #6351

Merged
ntkathole merged 11 commits into feast-dev:master from jvincent-mongodb:master
May 1, 2026

Conversation

@jvincent-mongodb
Contributor

@jvincent-mongodb jvincent-mongodb commented Apr 30, 2026

What this PR does / why we need it:

Adds reference documentation for the MongoDB contrib data source and offline store integration. This fills the documentation gap for the MongoDB offline store that was added in #6138.

Changes:

  • New docs/reference/data-sources/mongodb.md — documents MongoDBSource, configuration, supported types
  • New docs/reference/offline-stores/mongodb.md — documents MongoDBOfflineStore, data model, retrieval semantics, materialization, and known limitations
  • Updated docs/SUMMARY.md and docs/reference/data-sources/README.md to include MongoDB in the navigation

Which issue(s) this PR fixes:

Follows up on #6138 (feat: MongoDB offline store)

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Testing is not required for this change

This is a docs-only change (no code changes).

Misc

Documentation covers:

  • Data source configuration and supported types
  • Offline store YAML configuration example
  • Data model (shared feature_history collection design)
  • Retrieval semantics and point-in-time join behavior
  • Materialization flow
  • Known limitations

Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
@jvincent-mongodb jvincent-mongodb requested a review from a team as a code owner April 30, 2026 17:23
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

…tion

Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
Contributor

@caseyclements caseyclements left a comment

LGTM!

Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
@bisht2050
Contributor

@ntkathole - This is ready for your review/merge. TIA!

Comment thread docs/reference/data-sources/mongodb.md Outdated
Use `retrieve_online_documents_v2()` to perform similarity search:

```python
results = FeatureStore.store.retrieve_online_documents_v2(
```
Member

Fine as an example, but a user copying this block verbatim might run into an issue. Consider instantiating the store first:

store = FeatureStore(repo_path=".")
results = store.retrieve_online_documents_v2(...)


## Key Optimizations

* **K-collapse**: Multiple FeatureViews that share the same join keys are queried in a single aggregation using `feature_view: {$in: [...]}`, reducing round trips.
Member

The actual get_historical_features implementation loops over each proj_name in fv_to_features and issues separate MongoDB aggregation pipelines per feature view.

Could K-collapse be listed as a possible future enhancement instead?

Contributor

You're right. K-collapse was removed after the complication around projection keys was discovered. Two projections, same feature view. Thank you for pointing this out. I may have to change docstrings too.
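
To make the difference in this thread concrete, here is a minimal sketch that only builds aggregation pipeline documents as plain Python dicts; it does not connect to MongoDB and is not the actual `get_historical_features` code. The `feature_view` discriminator comes from the docs under review, while `entity_key`, `event_timestamp`, the `features.` prefix, and both function names are hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timezone
from typing import Any

Pipeline = list[dict[str, Any]]


def per_view_pipelines(
    fv_to_features: dict[str, list[str]],
    entity_keys: list[bytes],
    as_of: datetime,
) -> dict[str, Pipeline]:
    """Build one pipeline per feature view, as the review comment describes."""
    pipelines: dict[str, Pipeline] = {}
    for fv_name, features in fv_to_features.items():
        projection = {f"features.{name}": 1 for name in features}
        pipelines[fv_name] = [
            {
                "$match": {
                    "feature_view": fv_name,  # discriminator field from the docs
                    "entity_key": {"$in": entity_keys},  # hypothetical field name
                    "event_timestamp": {"$lte": as_of},  # hypothetical field name
                }
            },
            {"$project": {"entity_key": 1, "event_timestamp": 1, **projection}},
        ]
    return pipelines


def collapsed_match_stage(
    fv_names: list[str],
    entity_keys: list[bytes],
    as_of: datetime,
) -> dict[str, Any]:
    """The K-collapse idea: a single $match that covers several feature views."""
    return {
        "$match": {
            "feature_view": {"$in": sorted(fv_names)},
            "entity_key": {"$in": entity_keys},  # hypothetical field name
            "event_timestamp": {"$lte": as_of},  # hypothetical field name
        }
    }


fv_to_features = {"driver_stats": ["conv_rate"], "driver_profile": ["rating"]}
keys = [b"driver:1001"]
as_of = datetime(2026, 5, 1, tzinfo=timezone.utc)

separate = per_view_pipelines(fv_to_features, keys, as_of)
collapsed = collapsed_match_stage(list(fv_to_features), keys, as_of)
```

The collapsed stage selects on the feature view name alone, which may be one way to see why two projections of the same feature view, as mentioned above, would be awkward to separate after a single query.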

@ntkathole
Member

@jvincent-mongodb Do you also want to include the data source and offline store in feast/README.md?

Comment thread docs/reference/offline-stores/mongodb.md Outdated
Comment thread docs/reference/offline-stores/mongodb.md Outdated
jvincent-mongodb and others added 3 commits May 1, 2026 07:18
@bisht2050
Contributor

Hi @ntkathole! Could you please take another look at this?

@ntkathole
Member

@jvincent-mongodb Let's update docs/roadmap.md to fix the CI, and then I will merge.

Signed-off-by: jvincent-mongodb <jeffrey.vincent@mongodb.com>
@ntkathole ntkathole merged commit 3eccc2e into feast-dev:master May 1, 2026
26 of 28 checks passed
Comment thread README.md

## 📦 Functionality and Roadmap

The list below contains the functionality that contributors are planning to develop for Feast.
Contributor

@ntkathole This is a strange line. Isn't the list below functionality that contributors HAVE developed for Feast?


The MongoDB online store supports [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/), enabling similarity search over feature embeddings stored in MongoDB Atlas. This is powered by the `$vectorSearch` aggregation stage and requires MongoDB Atlas (or the `mongodb/mongodb-atlas-local` Docker image for local development).

See [PR #6344](https://github.com/feast-dev/feast/pull/6344) for full implementation details.
Contributor

Should this line be included? It links back to code.

provider: local
online_store:
  type: mongodb
  connection_string: mongodb+srv://<user>:<pass>@cluster.mongodb.net
Contributor

You'll see in CI, but this may need # pragma: allowlist secret too.

top_k=5,
)

# Each result is a (event_timestamp, entity_key_proto, feature_dict) tuple.
Contributor

The FeatureStore version returns an `OnlineResponse`, so I'd just remove these lines.

Given the change, I suggest reviewing the comments below. They are about the implementation in MongoDB, which IS what they want to hear about (how it works), just not specific to the function above.

The MongoDB offline store provides support for reading [MongoDBSource](../data-sources/mongodb.md).
* Uses a single shared collection with a compound index for all FeatureViews, distinguished by a `feature_view` discriminator field.
* Entity dataframes can be provided as a Pandas dataframe. The offline store converts entity identifiers into serialized entity keys for efficient lookup against the collection.

Contributor

Skip the description.
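
For readers following the data-model discussion, the shared-collection design and the point-in-time lookup listed in the PR description can be sketched in memory with plain Python. This is only an illustration under stated assumptions, not the offline store's implementation: `feature_view` is the discriminator named in the hunk above, while `entity_key`, `event_timestamp`, `features`, the sample values, and the helper `latest_as_of` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def latest_as_of(
    docs: list[dict[str, Any]],
    feature_view: str,
    entity_key: bytes,
    as_of: datetime,
) -> Optional[dict[str, Any]]:
    """Return the newest document for (feature_view, entity_key) at or before as_of."""
    candidates = [
        doc
        for doc in docs
        if doc["feature_view"] == feature_view
        and doc["entity_key"] == entity_key
        and doc["event_timestamp"] <= as_of
    ]
    # default=None covers the case where no document is old enough.
    return max(candidates, key=lambda doc: doc["event_timestamp"], default=None)


def utc(year: int, month: int, day: int) -> datetime:
    """Small helper for timezone-aware sample timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# All feature views live in one list, standing in for the single shared collection.
history = [
    {"feature_view": "driver_stats", "entity_key": b"driver:1001",
     "event_timestamp": utc(2026, 4, 1), "features": {"conv_rate": 0.40}},
    {"feature_view": "driver_stats", "entity_key": b"driver:1001",
     "event_timestamp": utc(2026, 4, 20), "features": {"conv_rate": 0.55}},
    {"feature_view": "driver_profile", "entity_key": b"driver:1001",
     "event_timestamp": utc(2026, 4, 25), "features": {"rating": 4.8}},
]

row = latest_as_of(history, "driver_stats", b"driver:1001", utc(2026, 4, 22))
too_early = latest_as_of(history, "driver_stats", b"driver:1001", utc(2026, 3, 1))
```

Here `row` is the April 20 `driver_stats` document, because the April 25 document belongs to a different feature view, and `too_early` is `None` because no document precedes March 1.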

4 participants