Skip to content

Add overwrite_file option to IMAP download_mail_attachments#67588

Open
Jualhosting wants to merge 7 commits into
apache:mainfrom
Jualhosting:fix-imap-overwrite
Open

Add overwrite_file option to IMAP download_mail_attachments#67588
Jualhosting wants to merge 7 commits into
apache:mainfrom
Jualhosting:fix-imap-overwrite

Conversation

@Jualhosting
Copy link
Copy Markdown

@Jualhosting Jualhosting commented May 27, 2026

Description

This PR adds a new boolean argument overwrite_file to ImapHook.download_mail_attachments.

Previously, download_mail_attachments would always overwrite local files if a downloaded attachment had the same filename as an existing file in the local output directory. This is problematic when downloading multiple attachments with the same name from different emails, or when running tasks concurrently, as it leads to accidental data loss.

By setting overwrite_file=False, the hook will now safely preserve existing files by appending a numeric suffix to the filename (e.g., file_1.ext, file_2.ext). The file creation uses exclusive creation mode ("xb") to ensure the operation is perfectly atomic and safe from race conditions during concurrent task executions.

Testing

  • Unit tests added/updated to verify overwrite_file logic.
  • Verified that "xb" properly raises FileExistsError and loops to the next suffix.

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Antigravity (Gemini-based AI coding assistant) following the guidelines


Copy link
Copy Markdown
Contributor

@SameerMesiah97 SameerMesiah97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall approach looks reasonable, but this is missing tests. I think we should cover the following scenarios:

  1. Overwriting an existing file
  2. Preserving an existing file when overwrite_file=False
  3. Suffix incrementing (file_1, file_2, etc.)
  4. Preserving file extensions so formats like .csv are not corrupted accidentally during renaming

CI needs to be triggered as well.

@Jualhosting
Copy link
Copy Markdown
Author

Thanks for the feedback! I have added unit tests covering both overwrite_file=True and overwrite_file=False scenarios. This includes testing the suffix incrementing logic to ensure existing files (and their extensions) are preserved properly. The new tests have been pushed and should trigger the CI.

Copy link
Copy Markdown
Contributor

@23tae 23tae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, @Jualhosting!
I've left a minor inline comment regarding production robustness.

A quick note on PR guidelines:
As per the Airflow PR Quality Criteria, all PRs need a meaningful description. Could you please update the PR body to briefly explain why this feature is needed?

Comment thread providers/imap/src/airflow/providers/imap/hooks/imap.py Outdated
@Jualhosting
Copy link
Copy Markdown
Author

Done! I've updated the PR body with the explanation and implemented the atomic file creation using \xb\ mode as suggested. Thanks for catching that!

@23tae
Copy link
Copy Markdown
Contributor

23tae commented May 27, 2026

@Jualhosting It looks like your description was placed inside the HTML comment tags (<!-- -->) of the PR template, which keeps it hidden on the rendered page.

If you move your text outside of those comment tags, it will display correctly.

@Jualhosting
Copy link
Copy Markdown
Author

@23tae Ah, you are absolutely right! The CLI tool I used to update the description accidentally swallowed the line breaks and pushed it inside the template comments.

I've just updated the PR body again manually to remove the HTML comments and restore the proper formatting. It should be fully visible and readable now! Thanks for the heads-up! 🍻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants