The character set encoding has to be detected by the caller.
A leading BOM (Byte Order Mark, the character U+FEFF, encoded as the bytes
FE FF or FF FE in UTF-16 and as EF BB BF in UTF-8) should therefore be
ignored by textwolf. Currently this is not the case:
the caller has to pass an XML document without a leading BOM down to textwolf.
Because the caller has to detect the character set encoding anyway, skipping the
BOM before passing the content to textwolf should not require much effort.
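
A minimal sketch of how a caller might detect and skip a leading BOM before
handing the content to textwolf. The names BomKind, BomInfo, detectBom and
stripBom are hypothetical helpers invented for this illustration and are not
part of textwolf. The sketch assumes the whole document is available in a
std::string and only covers UTF-8 and UTF-16; UTF-32 is not handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper types, not part of textwolf.
enum class BomKind { None, Utf8, Utf16BE, Utf16LE };

struct BomInfo {
    BomKind kind;        // encoding indicated by the BOM, if one was found
    std::size_t length;  // number of leading bytes occupied by the BOM
};

// Inspect the first bytes of the buffer for a UTF-8 or UTF-16 BOM.
// Note: a UTF-32LE BOM (FF FE 00 00) would be reported as UTF-16LE here,
// because UTF-32 is deliberately left out of this sketch.
inline BomInfo detectBom(const std::string& buf)
{
    auto at = [&buf](std::size_t i) {
        return static_cast<unsigned char>(buf[i]);
    };
    if (buf.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return {BomKind::Utf8, 3};
    if (buf.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return {BomKind::Utf16BE, 2};
    if (buf.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return {BomKind::Utf16LE, 2};
    return {BomKind::None, 0};
}

// Return a copy of the buffer with a leading BOM removed, if present.
inline std::string stripBom(const std::string& buf)
{
    return buf.substr(detectBom(buf).length);
}
```

The caller could use the kind reported by detectBom as one input to its own
encoding detection and then pass the result of stripBom to textwolf. A buffer
without a BOM is returned unchanged.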