Hi,
Thanks for sharing this nice work. Could you share the datasets and code used for the tasks in Section 7 of the paper? Since the results depend heavily on cross-matching of data, they are very difficult to verify. At a minimum, tables of the cross-matched sources would be great so we can fetch the correct images/spectra, along with any segmentation information.
Perhaps this belongs in a separate issue, but do you envision this model being limited to data from the surveys it was pre-trained on? Or would one need to train their own tokenizer to use data from other sources?