inspire-harvester: add more transformation rules by jrcastro2 · Pull Request #507 · CERNDocumentServer/cds-rdm

jrcastro2 · 2025-06-17T14:52:08Z

No description provided.

jrcastro2 · 2025-06-17T15:00:56Z

site/cds_rdm/inspire_harvester/transform_entry.py

+        if degree_type:
+            result["type"] = degree_type
+        if institutions:
+            uni = institutions[0].get("name")


Are we okay with this? Any other ideas?

No, I would fail if there is more than one value.
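
A minimal sketch of the stricter behaviour suggested in the reply above. It assumes `institutions` is a list of dicts with a "name" key, as the hunk implies. The helper name `pick_single_institution` and the choice of `ValueError` are hypothetical and not taken from the PR.

```python
def pick_single_institution(institutions):
    """Return the name of the only institution, or None if the list is empty.

    Hypothetical helper: it raises instead of silently taking the first
    entry when more than one institution is present.
    """
    if not institutions:
        return None
    if len(institutions) > 1:
        raise ValueError(
            f"Expected at most one institution, got {len(institutions)}."
        )
    return institutions[0].get("name")
```

Failing loudly keeps ambiguous records visible to whoever runs the harvester, at the cost of rejecting records that a "take the first" rule would have accepted.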

jrcastro2 · 2025-06-17T15:01:36Z

site/cds_rdm/inspire_harvester/transform_entry.py

+
+        pub_infos = self.inspire_metadata.get("publication_info", [])
+        if pub_infos:
+            journal_cf = self._transform_journal(pub_infos[0])


There might be multiple pub_infos. Is it okay to take the first one?

kpsherva · 2025-06-19T09:47:13Z

site/cds_rdm/inspire_harvester/transform_entry.py

+                if lang:
+                    trans_title["lang"] = lang


It would be a bit more readable if you assigned all values, even if they default to None, and cleaned the empty keys at the end. That way we avoid nested if statements.

kpsherva · 2025-06-19T09:50:35Z

site/cds_rdm/inspire_harvester/transform_entry.py

+        for translation in translations:
+            lang = translation.get("language")
+            title = translation.get("title")
+            subtitle = translation.get("subtitle")
+            if title:
+                trans_title = {"title": title, "type": {"id": "translated-title"}}
+                if lang:
+                    trans_title["lang"] = lang
+                rdm_additional_titles.append(trans_title)
+            if subtitle:
+                sub = {"title": subtitle, "type": {"id": "subtitle"}}
+                if lang:
+                    sub["lang"] = lang
+                rdm_additional_titles.append(sub)


Suggested change

        for translation in translations:
            lang = translation.get("language")
            title = translation.get("title")
            subtitle = translation.get("subtitle")
            if title:
                title_type = "translated-title"
            elif subtitle:
                title_type = "subtitle"
            else:
                raise ValueError("translation has neither title nor subtitle")
            additional_title = {
                "title": title or subtitle,
                "type": {"id": title_type},
                "lang": lang,
            }
            # remove None keys
            additional_title = {
                k: v for k, v in additional_title.items() if v is not None
            }
            rdm_additional_titles.append(additional_title)

Not sure if it is more readable this way, so it is not a strong opinion.
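
One possible way to combine both comments, as a sketch only: assign every key up front, then drop the empty ones at the end. Unlike the suggestion above, it still emits both a translated title and a subtitle when a translation carries both, as the original hunk does. The function name `build_additional_titles` is hypothetical.

```python
def build_additional_titles(translations):
    """Hypothetical helper: map translation entries to additional-title dicts."""
    additional_titles = []
    for translation in translations:
        lang = translation.get("language")
        candidates = [
            ("translated-title", translation.get("title")),
            ("subtitle", translation.get("subtitle")),
        ]
        for type_id, text in candidates:
            if not text:
                continue
            # Assign all keys, even if "lang" ends up as None.
            entry = {"title": text, "type": {"id": type_id}, "lang": lang}
            # Clean up at the end: drop keys with empty values (None or "").
            additional_titles.append({k: v for k, v in entry.items() if v})
    return additional_titles
```

For example, `build_additional_titles([{"language": "fr", "title": "Titre"}])` yields a single entry with the "lang" key set, while an entry without a language yields a dict with no "lang" key at all.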

kpsherva · 2025-06-19T09:53:36Z

site/cds_rdm/inspire_harvester/transform_entry.py

+            if not value:
+                continue
+
+            if schema in ["PACS", "CERN LIBRARY"]:


Let's maybe leave a comment that it was agreed to drop these schemes.
Did you see the UDC schema in any of the records?
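
A sketch of how that comment could be attached to the check, assuming each keyword is a dict with "schema" and "value" keys (the hunk only shows the `schema` and `value` variables, so the input shape is a guess). The names `DROPPED_SCHEMAS` and `keep_keywords` are hypothetical.

```python
# Schemas that, per the review discussion, were agreed to be dropped
# rather than mapped during harvesting.
DROPPED_SCHEMAS = frozenset({"PACS", "CERN LIBRARY"})


def keep_keywords(keywords):
    """Hypothetical filter: skip empty values and dropped schemas."""
    kept = []
    for keyword in keywords:
        schema = keyword.get("schema")
        value = keyword.get("value")
        if not value:
            continue
        if schema in DROPPED_SCHEMAS:
            continue
        kept.append(keyword)
    return kept
```

Keeping the dropped schemas in one named constant gives the agreement a single documented place, and makes it easy to add another schema later if the team decides to drop it too.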

kpsherva · 2025-06-19T09:56:32Z

site/cds_rdm/inspire_harvester/transform_entry.py

+            if journal_record := info.get("journal_record"):
+                related_identifiers.append(
+                    {
+                        "identifier": journal_record,


This journal information should also go into the journal:journal custom field.

kpsherva · 2025-06-19T09:56:56Z

site/cds_rdm/inspire_harvester/transform_entry.py

+                related_identifiers.append(
+                    {
+                        "identifier": parent_rep,
+                        "scheme": "cdsref",


I think it was

Suggested change

-                        "scheme": "cdsref",
+                        "scheme": "cds_ref",

inspire-harvester: add more transformation rules

8882b80

jrcastro2 force-pushed the codex/implement-field-mappings-for-transform_entry-function branch from dd20fcc to 8882b80 · June 17, 2025 14:57

jrcastro2 wants to merge 1 commit into CERNDocumentServer:master from jrcastro2:codex/implement-field-mappings-for-transform_entry-function
