How to Convert Unicode to UTF8?
| ||
I'm finally back after a break of some months, which I took because of personal attacks against me in one of my old threads, even though I had found a simple workaround for a BlitzMax bug.

I have this code to convert UTF-8 back to Unicode:

Function Unicode$(UTF8$)
    Local RESULT$=""
    Local UTF$=""       ' payload bits collected so far, as a binary string
    Local Length=0      ' continuation bytes still expected
    Local Last$=""      ' lead byte, kept in case the sequence turns out to be invalid
    For Local i=1 To Len(UTF8$)
        Local Char$=Mid$(UTF8$,i,1)
        Local B$=Right$(Bin$(Asc(Char$)),8)
        If Length>0
            UTF$=UTF$+Right$(B$,6)
            Length:-1
            If Left$(B$,2)<>"10"
                ' not a continuation byte: give up on the sequence
                Length=0
                RESULT$=RESULT$+Last$
                Last$=""
            ElseIf Length=0
                RESULT$=RESULT$+Chr$(Bin2Int(UTF$))
            EndIf
        EndIf
        If Length=0
            If Left$(B$,1)="0"
                Result$=Result$+Char$; Length=0
            ElseIf Left$(B$,3)="110"
                ' 110xxxxx: one continuation byte follows
                Last$=CHAR$
                UTF$=Right$(B$,5)
                Length=1
            ElseIf Left$(B$,4)="1110"
                ' 1110xxxx: two continuation bytes follow
                UTF$=Right$(B$,4)
                Length=2
            ElseIf Left$(B$,5)="11110"
                ' 11110xxx: three continuation bytes follow
                UTF$=Right$(B$,3)
                Length=3
            EndIf
        EndIf
    Next
    Return RESULT$
EndFunction

Function Bin2Int:Int(Binary$)
    Local result=0
    Local D=1
    For Local i=Len(Binary$) To 1 Step -1
        If Mid$(Binary$,i,1)="1" Then result=result+d
        D=D+D
    Next
    Return result
EndFunction

I found some native Visual Basic 6 code (I don't have VB6 myself) that can convert Unicode to UTF-8, but I'm not sure how the \ operator works in Visual Basic 6, so I commented out the lines that use it:

Function UTF8_Encode$(Value$)
    Local result$
    For Local i = 1 To Len(value$)
        Local char = Asc(Mid(Value$, i, 1))
        If char < 128
            Result$ = Result$ + Mid(value$, i, 1)
        ElseIf ((char > 127) And (char < 2048))
            ' Result$ = Result$ + Chr$(((char \ 64) Or 192))
            Result$ = Result$ + Chr$(((char And 63) Or 128))
        Else
            ' Result$ = Result$ + Chr$(((char \ 144) Or 234))
            ' Result$ = Result$ + Chr$((((char \ 64) And 63) Or 128))
            Result$ = Result$ + Chr$(((char And 63) Or 128))
        EndIf
    Next
    Return result$
EndFunction

The original VB6 code (in the comments, not in the article itself) is here: http://www.nonhostile.com/howto-convert-byte-array-utf8-string-vb6.asp

NB: Can somebody fix the spelling in the title (covert -> convert)? I can't edit it myself.
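On the question about `\`: in VB6 the backslash is integer division, so `char \ 64` drops the low six bits and gives the same result as `char Shr 6` in BlitzMax for non-negative values. Also note that in BlitzMax `And` and `Or` are boolean operators, so the bitwise parts of the VB6 lines need `&` and `|` instead. What follows is only a sketch under those assumptions, not the code from the linked page. The name `EncodeUTF8` is made up for illustration, it only handles character codes below 65536, and for the three-byte case it uses the standard UTF-8 values (divide by 4096, lead byte 224) rather than the `144` and `234` in the commented-out line.

```blitzmax
SuperStrict

' Hypothetical helper (name chosen for illustration only).
' Returns a string in which every character holds one UTF-8 byte.
' Assumes character codes below 65536; surrogate pairs are not combined.
Function EncodeUTF8:String(text:String)
	Local result:String = ""
	For Local i:Int = 0 Until text.length
		Local c:Int = text[i]                        ' character code of the i-th character
		If c < $80
			result :+ Chr(c)                         ' 0xxxxxxx
		ElseIf c < $800
			result :+ Chr($C0 | (c Shr 6))           ' VB6: (c \ 64) Or 192
			result :+ Chr($80 | (c & $3F))           ' VB6: (c And 63) Or 128
		Else
			result :+ Chr($E0 | (c Shr 12))          ' VB6: (c \ 4096) Or 224
			result :+ Chr($80 | ((c Shr 6) & $3F))   ' VB6: ((c \ 64) And 63) Or 128
			result :+ Chr($80 | (c & $3F))           ' VB6: (c And 63) Or 128
		EndIf
	Next
	Return result
End Function
```

For example, `EncodeUTF8(Chr(233))` should give the two bytes 195 and 169, and `EncodeUTF8(Chr(8364))` (the euro sign) the three bytes 226, 130 and 172.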
| ||
A lot of my modules do UTF-8 to Max String conversions (and back). Have a look at the libxml, gtkmaxgui, etc. mods for examples. wxMax lets wxWidgets take the stress of string conversion, which was nice for a change :-)
| ||
I really want a native way, since I am already very close with the code above. BTW, I don't use MaxGUI, and I need to do the conversion before sending strings to some DLL files (a plugin system for jukebox software). Hence I want some small native code. Is there a particular module and example I should look at?

Here is the working code, which I have also put in the Code Archives:

Function UTF8$(Unicode$)
    Local RESULT$=""
    For Local i=1 To Len(Unicode$)
        Local Char$=Mid$(Unicode$,i,1)
        If Asc(Char$)<128
            ' 1 byte: 0xxxxxxx
            result$=result$+char$
        ElseIf Asc(Char$)>127 And Asc(Char$)<2048
            ' 2 bytes: 110xxxxx 10xxxxxx (11 payload bits)
            Local Bytes$=Right$(Bin$(Asc(Char$)),11)
            Local Byte1$="110"+Left$(Bytes$,5)
            Result$=Result$+Chr(Bin2Int(Byte1$))
            Local Byte2$="10"+Right$(Bytes$, 6)
            Result$=Result$+Chr(Bin2Int(Byte2$))
        Else
            ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (16 payload bits)
            Local Bytes$=Right$(Bin$(Asc(Char$)),16)
            Local Byte1$="1110"+Left$(Bytes$,4)
            Result$=Result$+Chr(Bin2Int(Byte1$))
            Local Byte2$="10"+Mid$(Bytes$, 5, 6)
            Result$=Result$+Chr(Bin2Int(Byte2$))
            Local Byte3$="10"+Right$(Bytes$, 6)
            Result$=Result$+Chr(Bin2Int(Byte3$))
        EndIf
    Next
    Return Result$
EndFunction

I know it could be faster, but it should work. Please note: it needs Bin2Int, which can be found in the UTF-8 to Unicode example (the other direction).

EDIT: Hey, I should have looked in the Code Archives first, since Junkprogger has already created such a function, which appeared recently.
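Since the post notes that the string-based approach could be faster, here is a rough sketch of the reverse direction (UTF-8 back to Unicode) done with integer arithmetic, so that it needs neither `Bin$` nor `Bin2Int`. It is an illustration under assumptions, not a replacement for the Code Archives entry: the name `DecodeUTF8` is made up, the input is assumed to be a string with one byte per character, only sequences of up to three bytes are decoded, and anything else is passed through unchanged.

```blitzmax
SuperStrict

' Hypothetical helper (name chosen for illustration only).
' Decodes 1- to 3-byte UTF-8 sequences stored one byte per character.
Function DecodeUTF8:String(bytes:String)
	Local result:String = ""
	Local i:Int = 0
	While i < bytes.length
		Local b:Int = bytes[i] & $FF
		Local code:Int = b       ' default: pass the byte through as-is
		Local extra:Int = 0      ' continuation bytes expected after the lead byte
		If (b & $E0) = $C0
			code = b & $1F       ' 110xxxxx
			extra = 1
		ElseIf (b & $F0) = $E0
			code = b & $0F       ' 1110xxxx
			extra = 2
		EndIf
		i :+ 1
		While extra > 0 And i < bytes.length
			Local nxt:Int = bytes[i] & $FF
			If (nxt & $C0) <> $80 Then Exit      ' not a continuation byte: stop early
			code = (code Shl 6) | (nxt & $3F)    ' append six payload bits
			i :+ 1
			extra :- 1
		Wend
		result :+ Chr(code)
	Wend
	Return result
End Function
```

With this sketch, `DecodeUTF8(Chr(195) + Chr(169))` should return `Chr(233)`, and `DecodeUTF8(Chr(226) + Chr(130) + Chr(172))` should return `Chr(8364)`.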