libxml and German umlauts
| ||
Hi, I've run into a strange problem reading German umlauts from XML files. It doesn't matter which encoding I use (German or UTF-8). Currently I use <?xml version="1.0" encoding="UTF8" ?> at the top of all my files. The strange thing is that it works for one file but not for any of the others. I'm reading the text with the getText method of the TXmlNode class.

Here is a sample of the working file (you may not see the German umlauts correctly, depending on your system):

    <?xml version="1.0" encoding="UTF8" ?>
    <items>
      <item id="6">
        <name>Parfüm</name>
        <image>parfuem.png</image>
        <script>
          char.say("me","Warum schleppe ich dieses stinkende Zeug eigentlich immer noch mit mir rum?")
        </script>
      </item>
    </items>

Sample from a file that doesn't work:

    <?xml version="1.0" encoding="UTF8" ?>
    <texts>
      <general>
        <text id="1">Many Umlaute: öäüÖÄÜ</text>
      </general>
    </texts>

I'm using BM 1.32. Any ideas? Maybe there is a way to use the HTML entities for umlauts (&uuml;, &auml; ...)? |
| ||
It depends on whether your file really is UTF-8 or not. To begin with, you need to declare the encoding as "UTF-8", not "UTF8". Then you need to ensure that those characters are actually stored in the file as UTF-8, which depends on the editor you use to save the file. I'm not sure if you can use HTML entities, but you can use a Unicode character reference, which I think for Ö is the one built from the hex code D6. BlitzMax internally uses 2-byte characters, and I think by default it saves files as ISO-8859-1. If you create Strings in BlitzMax and store them with libxml, they will be correct. If you save a standard file with BlitzMax, it won't be UTF-8. If your editor supports correct file encoding, it should not be a problem. |
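To make the difference between the two encodings concrete, here is a small sketch in Python (not BlitzMax, purely for illustration) that shows how Ö is stored under UTF-8 and under ISO-8859-1, and how to check whether a sequence of bytes is valid UTF-8. The helper name is_valid_utf8 is made up for this example.

```python
# How "Ö" is stored on disk under the two encodings discussed above.
text = "Ö"

utf8_bytes = text.encode("utf-8")         # two bytes: C3 96
latin1_bytes = text.encode("iso-8859-1")  # one byte:  D6

print(utf8_bytes.hex(), latin1_bytes.hex())


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The ISO-8859-1 byte for "Ö" is not valid UTF-8 on its own, which is the
# kind of mismatch an XML parser will trip over if the file says "UTF-8".
print(is_valid_utf8(utf8_bytes), is_valid_utf8(latin1_bytes))
```

The same check could be run on a whole file by reading it in binary mode and passing the bytes to is_valid_utf8. A file that passes is at least consistent with a UTF-8 declaration.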
| ||
The D6 reference doesn't work. My XML files are written with TextMate, which supports saving as UTF-8. It's the default anyway. |
| ||
Hmm... it must be the int value, not hex, then... maybe &#214; |
| ||
I'm not sure what your umlaut problem is, exactly. Can you explain exactly where you are having an issue with the text? I pasted your example into BBEdit, corrected the encoding attribute, saved as UTF-8, and loaded it into this:

    SuperStrict

    Framework bah.libxml
    Import brl.standardio
    Import brl.system

    ' Parse the file and walk <texts> -> <general> -> <text>
    Local doc:TxmlDoc = TxmlDoc.parseFile("test.xml")
    Local root:TxmlNode = doc.getRootElement()
    Local children:TList = root.getChildren()

    For Local general:TxmlNode = EachIn children
        For Local text:TxmlNode = EachIn general.getChildren()
            Print text.getText()   ' console output
            Notify text.getText()  ' message box
        Next
    Next

If you are outputting to the console (in the IDE), the Print output will appear corrupt. No idea why... I guess it has issues. Notify converts the text properly (to UTF8) and so displays it fine. MaxGUI should also be able to display the text fine, as should wxMax. The graphics modes should be okay too... |
| ||
Try wrapping it in CDATA tags. Maybe that will help. |
| ||
Now I'm using the following codes and it works fine (written as &#<CODE>;):

    Ä 196
    Ö 214
    Ü 220
    ä 228
    ö 246
    ü 252
    ß 223

Example: &#196; makes Ä |
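The codes above are simply the decimal Unicode code points of the characters, so the references can be generated rather than looked up. Here is a small sketch in Python (not BlitzMax; it uses the standard xml.etree.ElementTree parser rather than libxml, and the helper name to_char_refs is made up for this example). It also shows that the hexadecimal form is accepted when it is written with an x, as in &#xD6;.

```python
import xml.etree.ElementTree as ET


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#214;."""
    return "".join(c if ord(c) < 128 else f"&#{ord(c)};" for c in text)


escaped = to_char_refs("Many Umlaute: öäüÖÄÜ")
print(escaped)

# The escaped text is pure ASCII, so the file's real encoding no longer matters
# for these characters; the parser turns the references back into umlauts.
xml_source = f'<?xml version="1.0" encoding="UTF-8"?><text id="1">{escaped}</text>'
node = ET.fromstring(xml_source.encode("ascii"))
print(node.text)

# Hexadecimal references need the "x" prefix to be valid XML.
hex_node = ET.fromstring("<text>&#xD6;</text>")
print(hex_node.text)
```

The trade-off is readability: the XML source becomes harder to edit by hand, so this is mainly useful when the editor or toolchain cannot be trusted to keep the file in UTF-8.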