Project Perfect Mod Forums


Unicode for INIs
Moderators: Global Moderators, Red Alert 2 Moderators
Millennium
Commander


Joined: 09 Mar 2008
Location: Osaka (JP)/Hong Kong/Germany

Posted: Tue Dec 22, 2015 6:33 pm    Post subject: Unicode for INIs

It appears INI files in RA2 are not using Unicode character encoding, but ANSI. Will the engine encounter any difficulties if I save an INI in Unicode (and with characters that are not part of the ANSI range)?

Perhaps the answer to this question is obvious to anyone who has even a little understanding of how software works. If so, please excuse me; I'm almost totally ignorant about it.

Thank you!

_________________
Mao Zedong wrote:

Our mission, unfinished, may take a thousand years.  

G-E
Defense Minister


Joined: 09 Feb 2015

Posted: Tue Dec 22, 2015 7:53 pm

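Technically unicode is supposed to be 2-byte per letter, but being a standard driven by Amrika, they made silly exceptions to allow a class called UTF-8, which is identical to 1-byte ASCII for the first 128 characters only. Anything beyond that takes 2 or more bytes in UTF-8 and does not match the 1-byte ANSI code pages.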

If you open Charmap in Winderos, it will give you the list of possible characters...

_________________
http://www.moddb.com/mods/scorched-earth-ra2-mod-with-smart-ai

Graion Dilach
Defense Minister


Joined: 22 Nov 2010
Location: Iszkaszentgyorgy, Hungary

Posted: Wed Dec 23, 2015 10:21 am

G-E wrote:
Technically unicode is supposed to be 2-byte per letter


That unicode fell out of use not because of American standards but because it couldn't solve the issues it was meant to solve. The 65k characters weren't enough. UTF-8 still has unused ranges even today, while nowadays it even gets extended with unused/dead languages and glyphs.
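
To make the "65k weren't enough" point concrete, here is a minimal Python sketch that counts how many bytes one character needs in each encoding. The helper name `encoded_sizes` and the three sample characters are picked purely for illustration; nothing here touches the game.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the byte length of a single character in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # "-le" variant: no byte-order mark added
        "utf-32": len(ch.encode("utf-32-le")),
    }

# An ASCII letter, a common kanji (U+6226), and a kanji outside the first 65k (U+20BB7).
for ch in ("A", "\u6226", "\U00020BB7"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The last character has a code point above U+FFFF, so a single 2-byte unit can't hold it: UTF-16 has to spend two 2-byte units (a surrogate pair) on it.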

_________________
"If you didn't get angry and mad and frustrated, that means you don't care about the end result, and are doing something wrong." - Greg Kroah-Hartman
=======================
Past C&C projects: Attacque Supérior (2010-2019); Valiant Shades (2019-2021)
=======================
WeiDU mods: Random Graion Tweaks | Graion's Soundsets
Maintainance: Extra Expanded Enhanced Encounters! | BGEESpawn
Contributions: EE Fixpack | Enhanced Edition Trilogy | DSotSC (Trilogy) | UB_IWD | SotSC & a lot more...

Askeladd
Light Infantry


Joined: 29 Dec 2013

Posted: Wed Dec 23, 2015 12:38 pm

Since UTF-8 is backwards compatible with ASCII it might actually work already, but it probably wouldn't be very useful given that the maximum number of bytes for an identifier is 24 or so (?) and the characters you are probably interested in cost 3 bytes each in UTF-8 for most kanji (the current standard allows at most 4 bytes per character; only the original design went up to 6).

Maybe instead you should think about a systematic naming scheme for your identifiers to minimize the number of characters. For instance:

[ATANK]
Primary=ATANKW1
Secondary=ATANKW2

[ATANKW1]
Projectile=ATANKPR

...and so on.
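
A rough Python sketch of the byte budget described above. The 24-byte limit is only the unverified "24 or so" figure from this post, and `utf8_cost` / `fits` are hypothetical helper names. It only measures size; it says nothing about whether the game accepts the bytes.

```python
ASSUMED_ID_LIMIT = 24  # bytes; the "24 or so" guess from the post, not a verified engine value

def utf8_cost(identifier: str) -> int:
    """Number of bytes the identifier occupies once saved as UTF-8."""
    return len(identifier.encode("utf-8"))

def fits(identifier: str, limit: int = ASSUMED_ID_LIMIT) -> bool:
    """True if the UTF-8 form of the identifier stays within the assumed byte limit."""
    return utf8_cost(identifier) <= limit

# One ASCII identifier from the example above and two made-up kanji names.
for name in ("ATANKW1", "重戦車主砲弾", "重戦車主砲弾発射体"):
    print(name, len(name), "chars,", utf8_cost(name), "bytes, fits:", fits(name))
```

Under that assumed limit a name could hold 24 ASCII characters but only 8 kanji at 3 bytes each, so kanji would not save any room.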

Millennium
Commander


Joined: 09 Mar 2008
Location: Osaka (JP)/Hong Kong/Germany

Posted: Wed Dec 23, 2015 12:59 pm

Askeladd wrote:
Since UTF-8 is backwards compatible with ASCII it might actually work already, but it probably wouldn't be very useful given that the maximum number of bytes for an identifier is 24 or so (?) and the characters you are probably interested in cost 3 bytes each in UTF-8 for most kanji (the current standard allows at most 4 bytes per character; only the original design went up to 6).

Maybe instead you should think about a systematic naming scheme for your identifiers to minimize the number of characters. For instance:

[ATANK]
Primary=ATANKW1
Secondary=ATANKW2

[ATANKW1]
Projectile=ATANKPR

...and so on.


Using kanji in strings has been one idea that I wanted to explore (if for nothing else, then for NOSTR'ing UINames) in my search for a naming scheme.
I had no idea how the string storage space works or how character encoding relates to it, but this thread, for all its vitriol, has been informative.

Given the way the limitation works, I will probably explore other schemes.

_________________
Mao Zedong wrote:

Our mission, unfinished, may take a thousand years.  

Bittah Commander
Defense Minister


Joined: 21 May 2003
Location: The Netherlands

Posted: Wed Dec 23, 2015 1:45 pm

To be honest, it would've taken less time to just open Rules.ini, change the name of any unit or structure in the game to something with kanji characters and confirm for yourself that it doesn't work than it took you to make a topic to ask about it instead...

_________________

Last edited by Bittah Commander on Wed Dec 23, 2015 2:41 pm; edited 1 time in total

Blade
Cyborg Commando


Joined: 23 Dec 2003

Posted: Thu Dec 24, 2015 4:56 pm

Unicode is supposed to be 4 bytes per character (it was once thought that 2 bytes would be enough, but that was short-sighted) and there are various ways of encoding the value into variable-length encodings. By far the most widely used now is UTF-8 for document encoding, as it maps the 7-bit ASCII chars to the same values, making any valid ASCII automatically valid UTF-8, and it also has no endian issues. Most WINAPI functions actually use UTF-16 internally for their Unicode versions and there are a bunch of Windows-only macros for doing UTF-16 string literals and such. However, I doubt that the internal ini parser uses the wide string functions, since all the ini files are currently ASCII, which isn't valid UTF-16 AFAIK.
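
A small Python sketch of the two claims above: ASCII text is byte-for-byte the same in UTF-8, while UTF-16 has a different layout that also depends on byte order. The sample line is borrowed from the example earlier in the thread.

```python
line = "Primary=ATANKW1"

ascii_bytes = line.encode("ascii")
utf8_bytes = line.encode("utf-8")
utf16_le = line.encode("utf-16-le")  # little-endian, the byte order Windows uses
utf16_be = line.encode("utf-16-be")  # big-endian

# Pure ASCII text produces exactly the same bytes in UTF-8.
print("ASCII == UTF-8:", ascii_bytes == utf8_bytes)

# UTF-16 spends two bytes on every one of these characters.
print("lengths:", len(ascii_bytes), "vs", len(utf16_le))

# The first two characters differ between the two byte orders (the endian issue).
print(utf16_le[:4], utf16_be[:4])
```

A parser that reads one byte per character would see a zero byte after every letter of a UTF-16 file, which is one reason an ASCII-only INI reader is unlikely to cope with it.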

RP
Commander


Joined: 12 Jul 2012
Location: Mapping God Heaven

Posted: Thu Dec 24, 2015 5:32 pm

NOSTR UIName does not support non-ASCII text as the tag is read as ASCII.
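
Since the tag is read as ASCII, one way to catch stray characters before launching the game is to scan the file for any byte outside the ASCII range. A minimal Python sketch follows; `find_non_ascii_lines` is a hypothetical helper and the sample INI content is made up for illustration.

```python
def find_non_ascii_lines(data: bytes) -> list[tuple[int, bytes]]:
    """Return (1-based line number, raw line) for every line containing a byte >= 0x80."""
    hits = []
    for number, raw in enumerate(data.splitlines(), start=1):
        if any(byte >= 0x80 for byte in raw):
            hits.append((number, raw))
    return hits

# Made-up sample: the second line carries kanji saved as UTF-8.
sample = "[ATANK]\nUIName=NOSTR:重戦車\nPrimary=ATANKW1\n".encode("utf-8")

for number, raw in find_non_ascii_lines(sample):
    print(f"line {number}: {raw!r}")
```

For a real file, the bytes could come from `pathlib.Path(...).read_bytes()` on your own copy of the INI.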

_________________


Mental Omega 3.0 Mission creator - Creator of FinalOmega: APYR 3.0 Map Editor

/ppm/'s stupidity
