Unicode testers required - apply within

Brucey(Posted 2008) [#1]
Since we all know that BlitzMax doesn't like non-English file systems, I thought I'd have a go at fixing the low-level file access stuff so that it was Unicode/UTF-8 friendly.

The following is very much beta/development-level/source-only/unofficial, so it's only really for those who are happy hacking their installations, know how to build modules, and know how to back up. For those that aren't, please leave the room now :-)

So I present here, FIVE modified core modules : Pub.stdc, BRL.AppStub, BRL.Blitz, BRL.FileSystem and BRL.System. Available in this zip (567kb).

Modules in pub.mod should be placed in pub.mod.
Modules in brl.mod should be placed in brl.mod.

Please back up these existing folders first. Please.


So, what do you get?
* Full unicode file support on ALL supported platforms :-)
* RequestFile and RequestDir have also been "upgraded".
* Win32 system dialogs (Confirm, Proceed etc) are also now unicode friendly.
* AppDir, AppFile and LaunchDir are now fixed.

Summary
My unit test suite passes on all 3 platforms, so I'm reasonably confident the changes are "good-to-go", but as is the case with all these things, I really need some proper field testing done. Once we're happy, I can maybe start pushing for it (or at least something equivalent) to be made official.
I think Unicode file support is important for BlitzMax.
I think people selling software to international customers might agree.


Anyway... feedback welcomed as always.

:o)


degac(Posted 2008) [#2]
OK, I installed your 'patches', but honestly I need a 'guide' to see how this upgrade can work for me (I'm working on an Italian Windows XP).
I've found this page with some texts written in different languages.
If I do a copy & paste in MaxIDE - for some fonts - I get the same 'glyph'.

I suppose I need the correct font to see in a Confirm window the same thing, or not?

Or - classic - I missed something?

Or - as I'm using an Italian system - do I need to install a 'new language' to see a different result?
Штқефддув ф туц дфтпгфпу - I mean 'Installed a new language', but nothing is different (apart from the fact that I must switch to English/Italian to type in MaxIDE, otherwise I get some error from the compiler, I presume)

edit: tested on Bmax 1.30 SVN rev.170

RequestFile "Рфддў цўкдв"



Brucey(Posted 2008) [#3]
:-)

The current issues concern non-ascii characters. Some of the characters >127 but < 255 will work okay without this patch, because they are within the byte-size range. It's possible that Italian words in general do not fall outside of a byte-range - I've no idea.

I suppose I need the correct font to see in a Confirm window the same thing, or not?

Depends what text you are pasting. Obviously, if you are trying to use Chinese glyphs and don't have Chinese support installed (like the fonts etc) you still won't see much. However, you can be reasonably happy that it should be displaying the correct character despite the fact it is not displaying properly. ;-)

In my tests I was using Cyrillic text (Bulgarian) which was unicode characters of 1000+ (imagine... ascii value of 1054 kind of thing). These take up two bytes. Previously for 1054, you would see '@' instead of the correct symbol. With this patch, the character displays as expected.
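To make the byte-size point concrete, here is a minimal sketch. It is written in Python purely for illustration and is not part of the patch; the helper name fits_in_one_byte is made up for this example.

```python
# Code point 1054 is the Cyrillic capital letter O (U+041E).
ch = chr(1054)

# Encoded as UTF-8 it occupies two bytes, so it cannot pass through
# a code path that assumes one byte per character.
utf8_bytes = ch.encode("utf-8")

# Characters in the 128..255 range still fit in a single byte under Latin-1,
# which is consistent with some accented letters working without the patch.
e_grave = "\u00e8"  # Italian 'e' with grave accent, code point 232
latin1_bytes = e_grave.encode("latin-1")


def fits_in_one_byte(c):
    """Hypothetical helper: True if the code point is within the byte range."""
    return ord(c) <= 255


print(len(utf8_bytes), len(latin1_bytes))
print(fits_in_one_byte(e_grave), fits_in_one_byte(ch))
```

Typical Italian accented letters fall in the single-byte range, whereas the Bulgarian test strings do not.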

Here's a small test : dialogtest.zip.
It should read some UTF8 strings from a file, and display them properly in 3 dialogs.

I didn't install any extra fonts for this on my XP Pro... but I dunno if it just has more support by default or not.


degac(Posted 2008) [#4]
Ok - perfect it works.
As expected...I made an error during the mod installation (extracting with WinRAR directly from the zip creates a new sub-folder system.mod/system.mod ...).
I was still using the old version.

SuperStrict

Framework BRL.System
Import BRL.StandardIO

Global UNI_DIR:String
Global UNI_FILE:String
Global UNI_NEWDIR:String
Global UNI_NEWFILE:String

Local lines:String[] = ["Орехчетасладки.tx","Орехчетасладки.tx"]
' pre-convert to wide-16 unicode (from UTF8)
' file functions expect strings to be wide-16 (standard BlitzMax string format)

UNI_DIR=lines[0]
UNI_NEWDIR=lines[1]
UNI_NEWFILE=lines[1]

' some prompts
Notify "This is '" + UNI_DIR + "'"
Confirm "Do you really want this " + UNI_NEWDIR
Proceed "It's all gone a bit " + UNI_NEWFILE
RequestFile UNI_DIR

I have no errors handling my text; I don't need to convert it.


Brucey(Posted 2008) [#5]
I have no errors handling my text; I don't need to convert it.

No worries :-)

On Windows the default text format is wide chars, so you shouldn't have to. On Mac it is UTF-8.
Any UTF-8 text you were to handle would need to be converted before using in Max Strings - on all platforms.
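As a rough illustration of that conversion step, the sketch below (Python, for illustration only; utf8_to_wide16 is a made-up name, not a BlitzMax function) decodes UTF-8 bytes into the 16-bit values a wide string would hold, and contrasts that with naively treating each byte as one character.

```python
def utf8_to_wide16(raw):
    """Decode UTF-8 bytes and return the 16-bit code units of the text."""
    text = raw.decode("utf-8")
    le = text.encode("utf-16-le")
    return [int.from_bytes(le[i:i + 2], "little") for i in range(0, len(le), 2)]


raw = "Орех".encode("utf-8")   # four Cyrillic letters stored as UTF-8

units = utf8_to_wide16(raw)    # one 16-bit value per letter
naive = list(raw)              # one value per byte: twice as many, all wrong

print(units)
print(naive)
```

Skipping the conversion gives a string that is twice as long as it should be and contains none of the intended characters.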


Brucey(Posted 2009) [#6]
Updated to the latest and greatest version. Also modified the original post to match.

This latest also fixes AppDir, AppFile and LaunchDir. :-)


degac(Posted 2009) [#7]
Just tested your latest work. I recompiled the MaxIDE source to handle Unicode files...

So with the new MaxIDE I tested the following basic example
SuperStrict

Framework BRL.System
Import BRL.StandardIO

Global UNI_DIR:String
Global UNI_FILE:String
Global UNI_NEWDIR:String
Global UNI_NEWFILE:String

Local lines:String[] = ["Орехчетасладки.tx","Орехчетасладки.tx"]

UNI_DIR=lines[0]
UNI_NEWDIR=lines[1]
UNI_NEWFILE=lines[1]

' some prompts
Notify "This is '" + UNI_DIR + "'"
Confirm "Do you really want this " + UNI_NEWDIR
Proceed "It's all gone a bit " + UNI_NEWFILE
Local fil$=RequestFile(UNI_DIR)
Notify fil$


Saved as TEST.bmx and a copy saved as ЧчӋӌ.bmx (I don't know what I wrote...but I need some international words...)
Test.bmx - loaded and compiled as expected
ЧчӋӌ.bmx - loaded BUT MaxIDE/compiler gives the following error
Building ћУуЎў
Unable to open source file 'C:/Documents and Settings/degac/Desktop/unicode_mods/[#C^.bmx'
Process complete

Any hints?
Of course ӌ in the code box is a glyph in Cyrillic (well, I believe...)


Brucey(Posted 2009) [#8]
You'd probably need to rebuild ALL the tools to have unicode support... like bmk etc.

bcc (the bmx -> asm compiler) might have issues too.


One step at a time, I think ;-)


degac(Posted 2009) [#9]
Ok...you are right :D

PS: I'm messing around with the Max3d code and I found a problem with LoadString; maybe there's something to change for UTF-8 & Unicode support


Brucey(Posted 2009) [#10]
I found a problem with LoadString

Yes, you are possibly right.

The returned string would need to be converted... like this : bbStringFromUTF8String(LoadString(.....))

But I don't think you can build that in... since some people might want to use LoadString for loading binary data - into a String.
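The trade-off can be sketched as follows, in Python for illustration only. It assumes LoadString keeps one character per byte, which is a reading of the comment above rather than something confirmed here; both function names are hypothetical.

```python
def load_as_byte_string(data):
    """One character per byte: lossless for any data, but not real text."""
    return data.decode("latin-1")


def load_as_utf8_text(data):
    """Interpret the bytes as UTF-8 text; raises on arbitrary binary data."""
    return data.decode("utf-8")


# Arbitrary binary content (not valid UTF-8).
binary = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])

# The byte-per-character string converts back to the exact original bytes.
round_trip = load_as_byte_string(binary).encode("latin-1")

# Forcing a UTF-8 conversion on the same data fails.
try:
    load_as_utf8_text(binary)
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Genuine UTF-8 text converts cleanly.
text = load_as_utf8_text("Орех".encode("utf-8"))

print(round_trip == binary, utf8_ok, text)
```

This is why the conversion is better left to the caller, who knows whether the file holds text or binary data.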