Parquet Visualizer Improvements #2184

@shopify-river

Description

Parquet Visualizer Improvements

Collated feedback from Pete Luferenko on the Apache Parquet artifact preview feature.


Bugs

  • Scrollbar disappears when expanding to full-size view. Clicking the expand button causes the vertical scrollbar to vanish. The user has to scroll down past the visible content to find the scrollbar again.
  • Scroll wheel does not work in the default (non-fullscreen) view. Mouse/trackpad scrolling only functions after entering full-screen mode.
  • "See all" redirects to the raw Parquet binary instead of rendering all data in the viewer. Users expect "See all" to show the complete dataset inline.
  • Column headers scroll away. Column titles are not sticky — when scrolling vertically, the header row disappears, making it hard to identify which column data belongs to.
  • Preview fails for expired artifacts. For old runs whose artifacts have expired, the fetch returns 404 from GCS and the preview does not load. The preview dialog gives no clear indication that the artifact is unavailable due to retention policies.
    • Note: PR #2181 partially addresses this with improved artifact expiry messaging. Ensure the popup/dialog also communicates this clearly.
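As a rough illustration of the expired-artifact bullet above, the dialog could map the fetch status to a user-facing explanation instead of failing silently. This is only a sketch: `describePreviewFailure`, the `PreviewFailure` type, and the message wording are hypothetical, not the viewer's actual API. It also assumes that a 404 from GCS means the artifact was removed by retention, which may not hold in every case.

```typescript
// Hypothetical result type for a failed preview fetch (not the viewer's real API).
type PreviewFailure = {
  kind: "expired" | "forbidden" | "unknown";
  message: string;
};

// Map an HTTP status from the artifact fetch to a message for the preview dialog.
// Assumption: a 404 from GCS indicates the artifact expired under the retention policy.
function describePreviewFailure(status: number): PreviewFailure {
  if (status === 404) {
    return {
      kind: "expired",
      message:
        "This artifact is no longer available. It was most likely removed by the retention policy.",
    };
  }
  if (status === 401 || status === 403) {
    return {
      kind: "forbidden",
      message: "You do not have permission to view this artifact.",
    };
  }
  // Anything else is reported generically, with the status for debugging.
  return {
    kind: "unknown",
    message: `The preview could not be loaded (HTTP ${status}).`,
  };
}
```

The dialog would then render `message` in place of the table, and could use `kind` to decide whether a retry button makes sense.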

Enhancements

  • Load more data by default. Currently only the first 10 rows are shown. Instead of loading the entire dataset (which risks performance issues for very large files), determine a reasonable size threshold (e.g. 1 MB or 10 MB) and load up to that limit. Display a clear message when data is truncated.
  • Add pagination or progressive loading. There is currently no way to paginate through the dataset or see anything beyond the first 10 rows. Allow users to navigate through larger datasets page by page.
  • Add schema viewer. The Parquet viewer should display the Parquet file schema (column names, types, nullability, etc.) — not just the row data.
  • Support permalinks. There is no way to link directly to a specific artifact preview. Consider either:
    • Opening the Parquet viewer in a separate browser tab (preferred), or
    • Providing a shareable permalink/URL to the preview state.
  • Show clear unavailability messaging in the preview popup. When an artifact has expired or is otherwise unavailable, the popup should explain why (e.g. retention policy) rather than just failing silently.
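One possible way to approach the size threshold and pagination items above is to plan the initial load from row-group metadata (Parquet records a row count and byte size per row group in the file footer) and to page over whatever was loaded. The sketch below is illustrative only: `RowGroupInfo`, `planInitialLoad`, `pageRange`, and the 1 MB budget in the usage note are assumed names and values, not part of the existing viewer.

```typescript
// Hypothetical summary of one Parquet row group, as read from the file footer.
interface RowGroupInfo {
  numRows: number;
  totalByteSize: number;
}

interface LoadPlan {
  rowGroups: number; // how many leading row groups to fetch
  rows: number; // rows contained in those groups
  bytes: number; // bytes contained in those groups
  truncated: boolean; // true if some row groups were left out
}

// Take leading row groups until the byte budget would be exceeded.
// The first group is always included so the preview is never empty.
function planInitialLoad(groups: RowGroupInfo[], maxBytes: number): LoadPlan {
  let rows = 0;
  let bytes = 0;
  let count = 0;
  for (const g of groups) {
    if (count > 0 && bytes + g.totalByteSize > maxBytes) break;
    rows += g.numRows;
    bytes += g.totalByteSize;
    count++;
  }
  return { rowGroups: count, rows, bytes, truncated: count < groups.length };
}

// Half-open row range [start, end) for a zero-based page, clamped to the data.
function pageRange(
  page: number,
  pageSize: number,
  totalRows: number
): { start: number; end: number } {
  const start = Math.min(Math.max(page, 0) * pageSize, totalRows);
  const end = Math.min(start + pageSize, totalRows);
  return { start, end };
}

// Message to show when only part of the file was loaded.
function truncationMessage(plan: LoadPlan, totalRows: number): string {
  return plan.truncated
    ? `Showing first ${plan.rows} of ${totalRows} rows (preview limit reached).`
    : `Showing all ${totalRows} rows.`;
}
```

For example, with three row groups of 100 rows and 400,000 bytes each and a 1 MB budget, `planInitialLoad` selects the first two groups (200 rows) and marks the result as truncated. Planning by whole row groups keeps the fetch aligned with how Parquet stores data, at the cost of overshooting the budget when the first group alone is larger than the limit.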

Screenshots

Default view — scrollbar missing, "Showing first 10 rows" at bottom:

Image

Full-screen view — column header scrolls away, no pagination:

Image Image

Feedback reported by Pete Luferenko. Issue created by River on behalf of Camiel van Schoonhoven.

Metadata

Assignees

No one assigned

    Labels

    P2, Nice to have

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests