The rule is:
WORD ^1 : /\b([\p{L}\d]+)\b/;
Citizen = any( ... );
CitizenWord = any( WORD "Staatsangehöriger", WORD "Staatsangehörige" );
Person = sequence_imm( last = WORD, COMMA, first = WORD, COMMA, Citizen, CitizenWord, COMMA, WORD "in", wohnort = WORD, COMMA );
Extracting a word that ends in a diacritical character, such as 'René', returns a truncated match:
first [133..134, 0|732 .. 0|736] 'Ren'
On the other hand, if the diacritical character is at the beginning or in the middle of the word, the extraction works:
wohnort [107..108, 0|565 .. 0|572] 'Zürich'
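
One possible explanation, offered as an assumption since the regex engine behind the grammar is not stated here: the engine treats \b as an ASCII-only word boundary, so 'é' counts as a non-word character. The closing \b then cannot match between 'é' and the following comma, and the engine backtracks to the boundary between 'n' and 'é'. The Python sketch below only reproduces that symptom; it is not the grammar tool's implementation. The names LETTERS, ascii_word, class_word and first_word are made up for illustration, and the explicit character ranges are a rough stand-in for \p{L}, which Python's re module does not support.

```python
import re

# Rough stand-in for \p{L}: ASCII letters plus the Latin-1/Latin Extended range.
LETTERS = r"A-Za-z\u00C0-\u024F"

# re.ASCII makes \b and \d ASCII-only, while the explicit ranges still match 'é'.
# This mimics an engine whose \b is not Unicode-aware.
ascii_word = re.compile(rf"\b([{LETTERS}\d]+)\b", re.ASCII)

# Alternative: replace \b with lookarounds built from the same character class,
# so the boundary test and the word class always agree.
class_word = re.compile(
    rf"(?<![{LETTERS}\d])([{LETTERS}\d]+)(?![{LETTERS}\d])", re.ASCII
)


def first_word(pattern, text):
    """Return the first captured word in text, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(first_word(ascii_word, "René,"))    # Ren
    print(first_word(ascii_word, "Zürich,"))  # Zürich
    print(first_word(class_word, "René,"))    # René
```

If the engine in question offers a Unicode flag for \b, enabling it would be the simpler fix. The lookaround form is a fallback that does not depend on how \b is defined.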