Optimize post_bsos (postgres) #1930

@data-sync-user

The initial version of post_bsos simply calls put_bso once per BSO. Ideally this should be inverted, as in the Spanner implementation: put_bso should be implemented as a call to post_bsos with a vec containing a single BSO.
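A minimal sketch of the inverted relationship, where the batch path is the primary implementation and the single-item path delegates to it. The `PostBso` struct and the in-memory `Db` are simplified, hypothetical stand-ins for illustration, not the project's actual types or signatures:

```rust
/// Hypothetical, simplified BSO write request (the real type has more fields).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct PostBso {
    pub id: String,
    pub payload: Option<String>,
    pub sortindex: Option<i32>,
    pub ttl: Option<u32>,
}

/// In-memory stand-in for a database handle (an assumption for illustration).
#[derive(Default)]
pub struct Db {
    /// Records every batch handed to `post_bsos`.
    pub batches_written: Vec<Vec<PostBso>>,
}

impl Db {
    /// Batch path: the one place where the write logic lives.
    /// Returns the number of BSOs in the batch.
    pub fn post_bsos(&mut self, bsos: Vec<PostBso>) -> usize {
        let count = bsos.len();
        self.batches_written.push(bsos);
        count
    }

    /// Single-item path: delegates to the batch path with a one-element vec.
    pub fn put_bso(&mut self, bso: PostBso) -> usize {
        self.post_bsos(vec![bso])
    }
}
```

With this shape, any optimization made to post_bsos would apply to put_bso as well.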

This could be tricky on Postgres with the diesel ORM, because the collection of BSOs could be a mix of different selective updates.

E.g. a single request could ask bso0 to update only its payload and bso1 to update only its sortindex. A plain multi-row upsert applies one ON CONFLICT DO UPDATE SET clause to every row, so it would adjust all the listed values across all the BSOs (in this case both payload and sortindex on both).

The Python implementation handles this by grouping all BSOs with the same set of field updates into separate batches. In the example above this would result in 2 separate upserts: one payload-only and one sortindex-only. This sounds difficult to emulate in the strongly typed diesel ORM.
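The grouping idea can be sketched in plain Rust, independent of diesel. This is only an illustration of the batching step, assuming a simplified, hypothetical `PostBso` struct with three optional fields. It is not a port of the Python code:

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified BSO write request.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct PostBso {
    pub id: String,
    pub payload: Option<String>,
    pub sortindex: Option<i32>,
    pub ttl: Option<u32>,
}

/// Which optional fields a BSO sets, as (payload, sortindex, ttl).
pub type FieldMask = (bool, bool, bool);

/// Computes the set of fields this BSO wants to update.
pub fn field_mask(bso: &PostBso) -> FieldMask {
    (
        bso.payload.is_some(),
        bso.sortindex.is_some(),
        bso.ttl.is_some(),
    )
}

/// Groups BSOs so that each group updates exactly the same fields.
/// Each group could then be written with its own upsert statement.
pub fn group_by_fields(bsos: Vec<PostBso>) -> BTreeMap<FieldMask, Vec<PostBso>> {
    let mut groups: BTreeMap<FieldMask, Vec<PostBso>> = BTreeMap::new();
    for bso in bsos {
        groups.entry(field_mask(&bso)).or_default().push(bso);
    }
    groups
}
```

For the bso0/bso1 example this yields two groups, one keyed by `(true, false, false)` and one by `(false, true, false)`. The hard part noted above remains: each group needs a differently shaped statement, which is what is awkward to express with diesel's static typing.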

The batch commit handles a similar situation via a COALESCE of the new value with the old one, so an unset field keeps its existing value. That might be a better option here.
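A rough sketch of what a COALESCE-based multi-row upsert could look like, written as a plain Rust function that builds the SQL text. The table name `bsos`, the column names, and the single-column conflict target are simplifying assumptions (the real schema presumably keys on more than the BSO id), and the sketch assumes the updatable columns accept NULL on insert:

```rust
/// Builds a multi-row upsert for `rows` BSOs using numbered placeholders.
/// An unset field is bound as NULL, and COALESCE then keeps the stored value.
/// Table and column names here are hypothetical.
pub fn build_upsert_sql(rows: usize) -> String {
    const COLS: usize = 3; // bso_id, payload, sortindex

    // One "($n, $n+1, $n+2)" tuple per BSO.
    let values: Vec<String> = (0..rows)
        .map(|row| {
            let base = row * COLS;
            format!("(${}, ${}, ${})", base + 1, base + 2, base + 3)
        })
        .collect();

    format!(
        "INSERT INTO bsos (bso_id, payload, sortindex) VALUES {} \
         ON CONFLICT (bso_id) DO UPDATE SET \
         payload = COALESCE(EXCLUDED.payload, bsos.payload), \
         sortindex = COALESCE(EXCLUDED.sortindex, bsos.sortindex)",
        values.join(", ")
    )
}
```

Because every row shares the same statement shape, a mixed batch such as bso0 (payload only) and bso1 (sortindex only) would fit in a single upsert. One limitation of this approach is that a field cannot be explicitly reset to NULL, since NULL is treated as "leave unchanged".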

┆Issue is synchronized with this Jira Task
