Unicode for translations?

Started by Radiant, Mon 12/02/2018 21:57:19


Radiant

Just to doublecheck: does the AGS translation code (i.e. the code that reads TRS files, turns them into TRA files, and displays those in-game) support Unicode? Assuming I'm using a TTF font that contains the requisite characters, of course. If so, would UTF-8 or plain unicode be preferable, or something else? When I try it it doesn't give error messages but doesn't appear to display the correct characters. Thanks!

Crimson Wizard

#1
Quote from: Radiant on Mon 12/02/2018 21:57:19
Just to doublecheck: does the AGS translation code (i.e. the code that reads TRS files, turns them into TRA files, and displays those in-game) support Unicode? Assuming I'm using a TTF font that contains the requisite characters, of course. If so, would UTF-8 or plain unicode be preferable, or something else? When I try it it doesn't give error messages but doesn't appear to display the correct characters. Thanks!

No, the AGS engine does not support Unicode. It treats any string it reads as a sequence of bytes (ANSI chars), and it does not display any errors, because to the engine a Unicode string is just another string, only one containing unknown codes. Any Unicode character encoded as 2, 3 or 4 bytes will be treated as 2, 3 or 4 separate letters.
How these letters are displayed is totally dependent on the font: if the font has anything for that code - it will display that symbol, if not, it will either skip it or display '?' (IIRC).
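
To illustrate the "separate letters" effect outside of AGS, here is a rough Python sketch. It is not engine code: the helper name as_single_byte_text is made up, and Latin-1 is only a stand-in for whatever single-byte codepage is in use, picked because it assigns a character to every byte value.

```python
# Hypothetical helper, not AGS code: mimics a renderer that maps
# every byte of a string to exactly one character.
def as_single_byte_text(text: str, codepage: str = "latin-1") -> str:
    raw = text.encode("utf-8")   # what a UTF-8 encoded TRS line would contain
    return raw.decode(codepage)  # one byte -> one "letter"


if __name__ == "__main__":
    for ch in ["A", "é", "€", "😀"]:
        seen = as_single_byte_text(ch)
        # Print the original character, its UTF-8 byte count, and
        # the separate single-byte characters it falls apart into.
        print(repr(ch), len(ch.encode("utf-8")), repr(seen))
```

A plain ASCII letter survives unchanged, while a character that takes 2, 3 or 4 bytes in UTF-8 comes out as 2, 3 or 4 unrelated characters. What actually shows up on screen then depends on which glyphs the font has for those codes.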

Radiant

Right, thanks. I suppose the issue is not so much that the translation code doesn't do unicode, but that the main code itself doesn't?

Crimson Wizard

#3
Quote from: Radiant on Mon 12/02/2018 22:07:34
Right, thanks. I suppose the issue is not so much that the translation code doesn't do unicode, but that the main code itself doesn't?

Sorry, yes, that's right, that's all because of the engine. I misread your first post, thinking you were talking about the engine reading translations.
The TRS/TRA mechanism is intentionally aligned with the engine. It converts everything to the current ANSI codepage when creating the TRA file. If it meets something that cannot be converted 1:1, it will replace it with some other character from within the ANSI range of codes.

Same happens to any text you put in the editor: strings in scripts, object names, and so on.
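
For a feel of what such a lossy conversion looks like, here is a small Python sketch. It is only an analogy, not the editor's actual code: the helper name to_ansi is invented, cp1252 and cp1251 are example codepages, and Python substitutes '?' where the editor may pick a different replacement character.

```python
# Hypothetical helper, not the AGS editor's code: converts text to a
# single-byte codepage, substituting anything that has no 1:1 mapping.
def to_ansi(text: str, codepage: str = "cp1252") -> bytes:
    # errors="replace" makes Python emit b"?" for unmappable characters.
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    print(to_ansi("café"))              # fits the Western codepage
    print(to_ansi("Привет"))            # Cyrillic does not, so it is replaced
    print(to_ansi("Привет", "cp1251"))  # a Cyrillic codepage keeps it intact
```

The same text is lost under one codepage and preserved under another, so the result depends entirely on which codepage is current when the conversion runs.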

Radiant

I was talking about engine reading translations, yes :) I figured adding unicode to translations would be pretty easy IF the main engine already supported unicode. But since it doesn't, never mind. We've already got some Russian and Japanese translations to work without unicode, after all.
