Unicode

tin(Posted 2008) [#1]
Is it possible to use a Unicode font in BlitzMax?


tonyg(Posted 2008) [#2]
This? and this? might help.
It is always worth using the search function.


Dreamora(Posted 2008) [#3]
Is it possible to use it for drawing etc.: yes
Will Blitz load any data from a path that contains non-US letters: no
Point of Unicode: debatable.


SebHoll(Posted 2008) [#4]
Dreamora:
That seems a bit odd - I've had a look and found a possible fix for Windows but I'm not sure how well it will work on Windows 98 PCs as they don't support unicode out of the box.

If anyone fancies testing it for me, please download and run this test app I just made: Unicode Test 3.



Drag n' drop a unicode file onto the window and see if it loads the text properly without crashing - this example seems to work fine here on my Vista PC.


Grisu(Posted 2008) [#5]
Can you make unicode menus too?


SebHoll(Posted 2008) [#6]
Can you make unicode menus too?

I assume you mean in MaxGUI - and the answer is, yes, if you are using the new MaxGUI.Win32MaxGUIEx driver.

The specific Unicode problem Dreamora was mentioning is more to do with the core BRL modules than MaxGUI.


Mark Tiffany(Posted 2008) [#7]
I think it works okay here on XP.

Can you make a test unicode file with all characters in it to test this with? I found a text file off Google and some characters seem okay, but others don't, and I am unsure whether this is due to the font or the loading of the file. (I think it was the Ethiopian chars that didn't work, although the Thai ones did!)


degac(Posted 2008) [#8]
I found this with google

http://yudit.org/cgi-bin/test.cgi

I just copied the text (with mouse selection) into Notepad and saved the .txt file in UNICODE format.

I don't know if this is the *correct* way (it is not an ORIGINAL UNICODE file after all...) but the .exe runs without crashes.


SebHoll(Posted 2008) [#9]
Just to clarify - BlitzMax was always able to load unicode text from files - it just wasn't able to open files that had unicode names, e.g. see the file loaded in the screenshot - a few random Unicode characters.

The changes I made to Pub.Stdc that were linked into the test app should now allow you to load unicode file names.


Mark Tiffany(Posted 2008) [#10]
Oh, okay...

Well, just to confirm, I have loaded a file called:

رأيتها وكانت كالفراشةt.txt

that has the same text inside it successfully here on Windows XP.


Mark Tiffany(Posted 2008) [#11]
ha - didn't think that would post very well! It's a bunch of arabic characters fyi.


degac(Posted 2008) [#12]
Ah. Ok.
New test: I changed my file test.txt to many combinations of different Unicode chars (Hindi+Greek+Arabic...) and the program works perfectly.


DavidDC(Posted 2008) [#13]
So I'm a little confused here :-)

Dreamora is correct in saying that the Blitzmax file system doesn't work with certain foreign (as in unicode) filenames and yet Seb steps in and says "here we go - it's working!"? His file link seems to be struggling for me, so I'm not sure what's going on?

Can the file system (and thus bmax as a whole) handle unicode character sets or not? Is Seb's fix potentially cross platform, or only on windows? Is it a hack or something more stable that might make it (soon) to a new BlitzMax release?

Sorry for being thick-headed...


SebHoll(Posted 2008) [#14]
Dreamora is correct in saying that the Blitzmax file system doesn't work with certain foreign (as in unicode) filenames

Yep - at present BlitzMax can store unicode strings internally but can't access files with unicode file names.

Seb steps in and says "here we go - it's working!"?

The problem lies with the standard C functions imported in Pub.Stdc - Microsoft Windows has different functions for handling unicode (wide character) strings and so these have to be used instead.

For example,

ANSI:

fopen( file$z, mode$z )

Windows Unicode Equivalent:

_wfopen( file$w, mode$w )

However, Mac OS X and Linux (I think) took a different tack - they allow Unicode file paths to be passed into the same old C functions that used to only handle ANSI characters, and the OS should recognise and handle them correctly. All that is needed in this case is to change the Extern declarations so that they pass wide-character (unicode) strings ($w) instead of standard ANSI C strings ($z), e.g.

fopen( file$w, mode$w )

However, we need to ensure that the older legacy Windows operating systems play nice with Unicode - Windows 95/98/Me were built for use with the older, more limited, ANSI character set and so Unicode support on these platforms is complicated - in addition, FAT and FAT32 (the most common file systems used with these versions of Windows) don't support Unicode filenames anyway!?!?
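
For readers who want to see the fopen / _wfopen split at the C level, here is a minimal sketch. It is not the actual Pub.Stdc change: the helper name fopen_utf8 is hypothetical, and it assumes the caller supplies the path as UTF-8 bytes (the convention Space Fractal describes for Linux further down), which is converted to wide characters only on Windows.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

/* Hypothetical helper (not part of Pub.Stdc): opens a file whose path and
   mode are given as UTF-8 encoded C strings. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    /* Windows: convert UTF-8 to wide characters and use the wide-character
       variant of fopen. Paths longer than MAX_PATH are rejected here. */
    wchar_t wpath[MAX_PATH];
    wchar_t wmode[16];
    if (!MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, MAX_PATH))
        return NULL;
    if (!MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16))
        return NULL;
    return _wfopen(wpath, wmode);
#else
    /* Elsewhere: the byte string is handed to the ordinary fopen unchanged. */
    return fopen(path, mode);
#endif
}
```

A caller would use it exactly like fopen, e.g. fopen_utf8(name, "rb"), and check the result for NULL as usual.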

Can the file system (and thus bmax as a whole) handle unicode character sets or not?

Most modern file-systems can (for example, NTFS and OS X HFS+, but not older file-systems such as FAT or FAT32). However, the BlitzMax modules need to be tweaked if unicode is to be supported by apps (regardless of whether Unicode is supported by the file-system or not).

Is it a hack or something more stable that might make it (soon) to a new BlitzMax release?

I've been experimenting with different approaches - the tweaks seem rock-solid on Windows 2000+, but Windows 9X is choking on them at the moment - it doesn't even like the new Unicode MaxGUI.Win32MaxGUIEx module for the same reason.

In the end it might end up being a toss-up between supporting Unicode, or supporting Windows 9X... Only Mr Sibly can make that decision. Still, let's see how it goes - hopefully I have a few more tricks up my sleeve yet.


Dreamora(Posted 2008) [#15]
Win98 SE and WinME support unicode if the additional package is installed. But they are officially abandoned so I don't think the download still exists at Microsoft.com
The package does not do much more than add some modified Win2k / XP libraries.
Don't know if this influences the filenames or not though.

But thank you very much for the hint on where to fix this issue.
I don't care much for the self-crashing OS generation (Win98 / ME) with their highly broken memory management.

It would be great if BM internally would use $w and add a ?Win9X compiler directive to allow the addition of stone age methods for the stone age OS :)


Space Fractal(Posted 2008) [#16]
I asked the same thing some months ago.

1. Unicode is a no-go at file level in Windows. If you want file-level unicode support there, you need to create a PureBasic DLL (or one in another language) to do that task (like I did, but I cannot share it due to the PureBasic license).

2. Unicode at file level seems to work very well in Linux, since the file system I used just uses UTF8. Here you need to convert BlitzMax Unicode strings to UTF8 and back. There are several functions for this task in the code archives.

3. I found that using UTF8:: when reading and saving content to a stream works pretty well with files saved from BlitzMax. Sometimes I do need to trim it.

4. Not sure how text documents work, since they can be saved in various unicode formats. But in my tests UTF8-based text files work best. Not 100% verified.

5. So using unicode fonts is no problem, at least not in a graphics window, which is what I use in my app.
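
The string-to-UTF8 conversion mentioned in point 2 can be sketched in C as below. This is not one of the code archive functions; the name utf16_to_utf8 is hypothetical, and it assumes the source string is held as 16-bit (UTF-16) code units. Unpaired surrogates are replaced with U+FFFD, which is a design choice of this sketch and not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes `len` UTF-16 code units from `in` as UTF-8
   into `out` (capacity `cap` bytes, no terminator is written).
   Returns the number of bytes written, or (size_t)-1 if `out` is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t len,
                     unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: combine with the following low surrogate. */
            if (i + 1 < len && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                i++;
            } else {
                cp = 0xFFFD; /* unpaired high surrogate */
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;     /* unpaired low surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need > cap)
            return (size_t)-1;

        if (need == 1) {
            out[n++] = (unsigned char)cp;
        } else if (need == 2) {
            out[n++] = (unsigned char)(0xC0 | (cp >> 6));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (unsigned char)(0xE0 | (cp >> 12));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (unsigned char)(0xF0 | (cp >> 18));
            out[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return n;
}
```

The resulting bytes are what a UTF8 file system expects in a path, so on Linux they could be passed straight to fopen once a terminating zero byte is appended.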


SebHoll(Posted 2008) [#17]
Win98 SE and WinME support unicode if the additional package is installed. But they are officially abandoned so I don't think the download still exists at Microsoft.com
The package does not do much more than add some modified Win2k / XP libraries.

Yep - that's the MSLU, it helps with the MaxGUI module, but it isn't the answer to all things Unicode by any means.

Also, it means that a unicows.dll will have to be distributed with compiled EXEs if they want them to run on Windows 9X/Me. In addition, the licensing terms for unicows.dll aren't actually that great - so perhaps the open source Opencow library could be used... Fortunately, both are compatible with libunicows, which is an open-source MinGW library for MSLU implementation.

Don't know if this influences the filenames or not though.

Herein lies another problem which I'm looking into now - for Unicode support, the Windows FileSystem module may have to be rewritten to use Windows API commands instead of C functions to maintain compatibility with Windows 9X/Me.

I don't care much for the self-crashing OS generation (Win98 / ME) with their highly broken memory management.

Ditto - but a lot of people don't agree with us on that one.

It would be great if BM internally would use $w and add a ?Win9X compiler directive to allow the addition of stone age methods for the stone age OS :)

That's effectively what my hack is - except that it would be impractical to have a ?Win9X compiler directive (we don't want to change behaviour when the code is compiled on Windows 9X, instead we want it to run differently at runtime on Windows 9X, regardless of where it has been compiled).
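
To illustrate the compile-time versus runtime distinction, here is a rough C sketch of a runtime check. It is not SebHoll's actual hack: the names path_api, choose_path_api and detect_path_api are hypothetical, and it assumes the documented behaviour of the Win32 GetVersion() call, whose return value has the high-order bit set on Windows 95/98/Me. GetVersion() is deprecated on current Windows, so treat this purely as an illustration of the idea.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Which family of file functions to call (hypothetical names). */
typedef enum {
    PATH_API_NARROW, /* byte-string functions such as fopen */
    PATH_API_WIDE    /* wide-character functions such as _wfopen */
} path_api;

/* Decides from a GetVersion()-style value: the high-order bit is set on
   Windows 95/98/Me, where the wide-character functions are not usable. */
path_api choose_path_api(unsigned long version)
{
    return (version & 0x80000000UL) ? PATH_API_NARROW : PATH_API_WIDE;
}

/* Runs on the machine executing the program, so the same EXE behaves
   differently on Windows 9X and on Windows 2000+, wherever it was built. */
path_api detect_path_api(void)
{
    path_api api;
#ifdef _WIN32
    api = choose_path_api((unsigned long)GetVersion());
#else
    api = PATH_API_NARROW; /* non-Windows: paths are plain byte strings */
#endif
    assert(api == PATH_API_NARROW || api == PATH_API_WIDE);
    return api;
}
```

The result would typically be computed once at startup and used to pick between the narrow and wide code paths for every file call.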

Just out of curiosity, how many people still use Windows 9X? The BlitzMax product page advertises support for Windows 98/ME - ho, hum!


DavidDC(Posted 2008) [#18]
Clarity, sweet clarity! Thanks heaps Seb :)


THANOS(Posted 2015) [#19]
Hmmmmm, English is a more common language, especially when it comes to programming languages. I mean, who would create a programming language which uses Russian or Chinese instead of English? No one. So, I suppose that for my projects I will stick to ANSI instead of Unicode (even the name "cmd.exe" is written in ANSI, :P )