
Add ShareClaw logistic regression label projection variants #32

Open
anubhav1004 wants to merge 2 commits into openproblems-bio:main from anubhav1004:codex/shareclaw-label-projection-methods

@anubhav1004 commented Mar 24, 2026

Summary

  • add shareclaw_logreg_scaled, a scaled PCA logistic regression baseline
  • add shareclaw_logreg_balanced, a class-balanced scaled PCA logistic regression baseline
  • keep both methods lightweight, reproducible, and task-compatible
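For reviewers, here is a minimal sketch of what a scaled PCA logistic regression baseline could look like with scikit-learn. It is an illustration under assumptions, not the code in this PR: the helper name `build_logreg_pipeline`, the number of components, and the synthetic data are all hypothetical. The only difference between the two variants in this sketch is `class_weight="balanced"`, which is scikit-learn's option for weighting classes inversely to their frequency.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_logreg_pipeline(n_components=10, balanced=False, seed=0):
    """Hypothetical helper: standardize features, reduce with PCA, then classify."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=seed),
        LogisticRegression(
            max_iter=1000,
            # "balanced" reweights classes inversely to their frequency.
            class_weight="balanced" if balanced else None,
            random_state=seed,
        ),
    )


# Synthetic stand-in for a cells x genes matrix with imbalanced labels
# (an assumption for illustration, not one of the task datasets).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
y = np.array(["common"] * 270 + ["rare"] * 30)
X[y == "rare", :5] += 2.0  # shift a few features so the rare class is learnable

scaled = build_logreg_pipeline(balanced=False).fit(X, y)
balanced = build_logreg_pipeline(balanced=True).fit(X, y)
pred_scaled = scaled.predict(X)
pred_balanced = balanced.predict(X)
```

Fixing `random_state` in both the PCA and the classifier is one way to keep runs reproducible.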

Validation

  • viash test src/methods/shareclaw_logreg_scaled/config.vsh.yaml
  • viash test src/methods/shareclaw_logreg_balanced/config.vsh.yaml

Small-resource notes
On the official resources_test datasets used locally on the VM:

  • shareclaw_logreg_scaled improved accuracy and f1_weighted over stock logistic_regression on cxg_immune_cell_atlas
  • shareclaw_logreg_balanced improved f1_macro over stock logistic_regression on cxg_immune_cell_atlas
  • all logistic regression variants achieved perfect scores on the small pancreas test resource

These methods are intentionally simple additions meant to provide useful lightweight baselines with different metric tradeoffs.
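To make the metric tradeoff concrete, the sketch below computes macro and weighted F1 from scratch on toy labels. The labels and the two prediction vectors are invented for illustration and are not drawn from the task datasets. Macro F1 averages per-class F1 equally, while weighted F1 averages by class support, so a model that recovers a rare class at the cost of some majority-class errors can gain on one and lose on the other.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from per-class TP, FP and FN counts."""
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def f1_weighted(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true cells per class."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy labels (hypothetical): 18 cells of a common type, 2 of a rare type.
y_true = ["T"] * 18 + ["NK"] * 2
majority_biased = ["T"] * 20               # never predicts the rare type
rare_aware = ["T"] * 12 + ["NK"] * 8       # recovers both NK cells, mislabels 6 T cells

macro_gain = f1_macro(y_true, rare_aware) - f1_macro(y_true, majority_biased)
weighted_gain = f1_weighted(y_true, rare_aware) - f1_weighted(y_true, majority_biased)
```

On this toy example `macro_gain` is positive and `weighted_gain` is negative. That is the kind of tradeoff a class-balanced variant can produce.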

@anubhav1004
Author

Added two stronger LDA variants to this branch after a ShareClaw-guided sweep over the official small task resources.

Current best small-resource candidate:

  • shareclaw_lda_svd: aggregate_macro_f1=0.700112 across cxg_immune_cell_atlas + pancreas
  • On cxg_immune_cell_atlas: accuracy=0.546012, f1_macro=0.400224, f1_weighted=0.562549
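As a sanity check on the aggregate, the reported value is consistent with an unweighted mean of per-dataset f1_macro if shareclaw_lda_svd scores 1.0 on pancreas. Both the aggregation rule and the pancreas score for this method are assumptions here, since neither is stated in this thread:

```python
# Reported above for shareclaw_lda_svd on cxg_immune_cell_atlas.
cxg_f1_macro = 0.400224

# ASSUMPTION: a perfect f1_macro on the small pancreas resource; not stated for LDA.
pancreas_f1_macro = 1.0

# ASSUMPTION: the aggregate is an unweighted mean over the two datasets.
aggregate_macro_f1 = (cxg_f1_macro + pancreas_f1_macro) / 2
matches_reported = round(aggregate_macro_f1, 6) == 0.700112
```

If the aggregate is computed differently, the match may be coincidental.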

Current best Zebrafish local baseline:

  • shareclaw_lda_lsqr_auto: macro_f1=0.540053, accuracy=0.394122, balanced_accuracy=0.697012, weighted_f1=0.422995

Both new methods passed viash test on this branch.
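The method names suggest scikit-learn's `LinearDiscriminantAnalysis` with `solver="svd"` and with `solver="lsqr", shrinkage="auto"`. That mapping is inferred from the names alone and is not confirmed in this thread. A minimal sketch of the two configurations on synthetic data (hypothetical, not the task datasets):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic four-class data; each class mean is shifted along its own feature.
rng = np.random.default_rng(0)
labels = ["a", "b", "c", "d"]
X = rng.normal(size=(200, 20))
y = np.repeat(labels, 50)
for i, label in enumerate(labels):
    X[y == label, i] += 3.0

# ASSUMPTION: "lda_svd" corresponds to the default SVD solver (no shrinkage).
lda_svd = LinearDiscriminantAnalysis(solver="svd").fit(X, y)

# ASSUMPTION: "lda_lsqr_auto" corresponds to least-squares LDA with
# automatic (Ledoit-Wolf) covariance shrinkage.
lda_lsqr_auto = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)

train_acc_svd = lda_svd.score(X, y)
train_acc_lsqr = lda_lsqr_auto.score(X, y)
```

Covariance shrinkage is usually most useful when there are many features relative to the number of cells per class.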

@anubhav1004
Author

Hi @rcannood, I would love your review on this when you get a chance! On cxg_immune_cell_atlas, shareclaw_logreg_scaled improves accuracy and f1_weighted over the stock logistic_regression baseline, and shareclaw_logreg_balanced improves f1_macro. Both methods pass viash test. Happy to address any feedback. Thanks!

