Libxml Module

Brucey(Posted 2006) [#1]
Thought I'd do something useful for a change, so am hereby announcing the release of my Libxml module.

You can download it HERE.

For more information on what libxml is, see the libxml website.

You might be wondering why we need yet another xml module?
Here are a few reasons why you might want to consider this one above the rest... ;-)
* It's really quick - no, really.
* Fully documented.
* Feature-rich API.
* Strong, validating parser.
* Support for namespaces, DTDs, and countless standards related to markup languages.
* Handles de/compression of xml files on the fly (using zlib)
...etc...

If you notice any issues, you can either email me directly (mail listed in my profile) or post questions / problems here on the forums.


assari(Posted 2006) [#2]
nice work Brucey.


altitudems(Posted 2006) [#3]
Great work as usual,

Any progress on GED? Are you working with matt on Project Studio IDE?

Thanks!


Difference(Posted 2006) [#4]
Great! I'm using Libxml2 now through LoadLibraryA(), so I'm looking forward to testing this.

When using "libxml2.dll", I also have to include "zlib1.dll" and "iconv.dll" with the app. Can I assume this is never needed with your module? ("zlib1.dll" and "iconv.dll" seem to be present on some Windows versions (XP Pro?) by default.)


N(Posted 2006) [#5]
Very nice work, Brucey. Will be nice to be able to use a tried and true library.

Thanks a lot :)


FlameDuck(Posted 2006) [#6]
Wonderful work mate. Now about that Eclipse plug-in...


Brucey(Posted 2006) [#7]
Peter, it uses pub.zlib module for zlib support. There are no external requirements, so it should build *as is* on all platforms.

XPath isn't working quite right yet, but it is consistently not working on all platforms, so I'm assuming it's something I've done with the implementation of it...

I intend to pad out the support for the rest of libxml at some point. Currently, I've got nearly all of the Tree API working, and some basics of the parser API. But, if you've ever looked into it before, there is a *lot* in there that you can use.

Ducky... I was never quite happy with some of the engine for the plugin, and I've been wanting to rewrite it, really. I suppose, perhaps, it wants to be hosted somewhere that others might be able to assist with it too... know of any good team-type repositories? :-p


FlameDuck(Posted 2006) [#8]
Peter, it uses pub.zlib module for zlib support. There are no external requirements, so it should build *as is* on all platforms.
Well there goes MaXML.

I suppose, perhaps, it wants to be hosted somewhere that others might be able to assist with it too... know of any good team-type repositories? :-p
Well, there's always SourceForge, of course. If you're not quite happy with that, I could perhaps set up a Subversion/Trac combo somewhere you're more comfortable with.


Difference(Posted 2006) [#9]
It's really great that it's cross platform, and no externals are needed.

After adding the functions below to the "Extern" section of libxml.bmx (they are the ones I seem to use the most), everything seems to work just as it does with the DLL version.

Function xmlReaderForFile:Int(filename:Byte Ptr,encoding:Byte Ptr,options:Int)
Function xmlFreeTextReader(reader:Int) 
Function xmlTextReaderRead(reader:Int) 
Function xmlTextReaderConstName:Byte Ptr(reader:Int)
Function xmlTextReaderConstValue:Byte Ptr(reader:Int)
Function xmlTextReaderAttributeCount(reader:Int)
Function xmlTextReaderGetAttribute:Byte Ptr(reader:Int,name:Byte Ptr)
Function xmlTextReaderNodeType(reader:Int)
Function xmlTextReaderHasValue(reader:Int)
Function xmlTextReaderReadInnerXml:Byte Ptr(reader:Int)
Function xmlTextReaderReadOuterXml:Byte Ptr(reader:Int)


Function UTF8Toisolat1(out:Byte Ptr,outlen:Int Var ,in:Byte Ptr,inlen:Int Var )
Function isolat1ToUTF8(out:Byte Ptr,outlen:Int Var ,in:Byte Ptr,inlen:Int Var )



Brucey(Posted 2006) [#10]
That's cool...

I'll add a new type TxmlTextReader and implement all that functionality in it.

As I said already, there is so much in there that I think I've only touched the surface of its capabilities...

:-)


Haramanai(Posted 2006) [#11]
Great!
Is this part of cairo wrapper?


Brucey(Posted 2006) [#12]
heh... no.. I should have a version of that out in the next day or so...


Brucey(Posted 2006) [#13]
Have updated the libxml module to 1.03 now.

This includes support for the Text Reader API (50+ specific methods) - a streaming XML reader (rather than the tree API which loads everything into RAM).
You can access it through the TxmlTextReader type. Fully documented as usual, with its own short introductory tutorial.

Enjoy :-)
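
For readers new to the streaming style, here is a minimal sketch of walking a document with TxmlTextReader. The method names used (fromFile, read, nodeType, constName, free) are assumed to mirror the libxml2 reader functions and may differ between module versions; CountElements and test.xml are hypothetical names used only for illustration.

```blitzmax
SuperStrict

Import BaH.Libxml

' Hypothetical helper: prints and counts the element start tags in a file.
' Assumes TxmlTextReader offers fromFile(), read(), nodeType(), constName() and free().
Function CountElements:Int(path:String)
	Local reader:TxmlTextReader = TxmlTextReader.fromFile(path)
	If reader = Null Then Return -1	' file missing or unreadable

	Local count:Int = 0
	' read() is assumed to return 1 while there are more nodes to visit
	While reader.read() = 1
		' node type 1 is an element start tag in libxml2's reader API
		If reader.nodeType() = 1 Then
			Print reader.constName()
			count :+ 1
		End If
	Wend

	reader.free()
	Return count
End Function

Print "Elements: " + CountElements("test.xml")
```

Because the reader only holds the current node, this pattern suits large files where building the whole tree in RAM would be wasteful.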


Difference(Posted 2006) [#14]
Excellent! Seems to work flawlessly with 1.20; still testing, though.
Thanks a lot for putting in the Text Reader API.


morszeck(Posted 2006) [#15]
Does cairo not work under Mac OS? I tested it, but the graphics output is not correct. Win32 isn't a problem, though.


Brucey(Posted 2006) [#16]
I haven't tracked down the display problem on Mac yet.

I think it must be something to do with the image data handling, since PDF export works flawlessly.
Perhaps it's an endian issue... :-(


morszeck(Posted 2006) [#17]
OK, I have tested again. The problem is only with the color: SetSourceRGB.

When I set only Blue (1.0) or LightBlue (0.1), the result is correct.
When I set only Green (1.0) or Red (1.0), or their light variants (0.1), the result is incorrect: nothing is drawn.

You're right, it's an endian issue...


Brucey(Posted 2006) [#18]
Thanks.
I'll have a look into it :-)


morszeck(Posted 2006) [#19]
Well, when I run your tested example clip_image.bmx with the original PF_BGRA8888, the output image is OK. It only breaks when working with SetSourceRGB(..).

But I can't see what's behind it. I have looked everywhere in the cairo code...

It would be very nice and cool to have cairo running on my Mac.


morszeck(Posted 2006) [#20]
I have tested again with setcolor:

 
SetSourceRGBA           ReadPixel
r   g   b   a           a   r   g   b
A0  00  00  FF          00  00  A0  FF
00  A0  00  FF          00  A0  00  FF
00  00  A0  FF          A0  00  00  FF
00  00  00  FF          00  00  00  FF
00  00  00  7F          00  00  00  7F
FF  FF  FF  FF          FF  FF  FF  FF
7F  7F  7F  7F          3F  3F  3F  7F
7F  7F  7F  FF          7F  7F  7F  FF



Here is the code:

SuperStrict

Import BaH.Cairo

Local cairo:TCairo = TCairo.Create(TCairoImageSurface.CreateForPixmap(100,100,PF_BGRA8888))

cairo.setantialias( CAIRO_ANTIALIAS_NONE )

Local normalizeMat:TCairoMatrix = TCairoMatrix.CreateScale(100.0,100.0)
cairo.SetMatrix(normalizeMat)


Local f:Double = 1.0 / 255

Local r:Double = f * $A0
Local g:Double = f * $B0
Local b:Double = f * $C0
Local a:Double = f * $D0

cairo.SetLineWidth(0.05)
cairo.SetSourceRGBA( 0.5, 0.5, 0.5, 1.0 )
cairo.Arc(0.5, 0.5, 0.4, 0, 360)
cairo.Stroke()
cairo.Destroy()


Local pix:TPixmap = CreatePixmap( 100, 100, PF_BGRA8888 )
pix = TCairoImageSurface( cairo.getTarget() ).pixmap()


Graphics 640,480,0
SetBlend ALPHABLEND
SetClsColor 255,255,255


Cls
DrawPixmap  pix,  0, 0
Flip


For Local x:Int=0 To 99
	For Local y:Int=0 To 99
		Local i:Int = ReadPixel( pix, x, y )
		If i<>$00000000 Then Print x +"  "+ y +"  "+ Hex( i )
	Next
Next


WaitKey()



Brucey(Posted 2006) [#21]
Hi, thanks for that.

The problem lies in the fact that Max always has Alpha at the same end, regardless of platform, whereas the Cairo image code reverses all four bytes on PPC, e.g. RGBA becomes ABGR (while the pixmap uses RGBA on both Intel and PPC)...

Will see if there's anything I can do to fix it...
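
To make the byte reversal concrete, here is a small sketch in plain BlitzMax. ReverseBytes is a hypothetical helper, not part of BaH.Cairo, and it only illustrates the symptom seen in morszeck's table; it is not necessarily how the module ends up fixing it.

```blitzmax
SuperStrict

' Hypothetical helper: reverses the four bytes of a 32-bit pixel value.
Function ReverseBytes:Int(v:Int)
	Return ((v Shl 24) & $FF000000) | ((v Shl 8) & $00FF0000) | ((v Shr 8) & $0000FF00) | ((v Shr 24) & $000000FF)
End Function

' Opaque red ($A0) packed as ARGB, matching the first row of the table
Local argb:Int = $FFA00000
Print Hex(argb) + " -> " + Hex(ReverseBytes(argb))
```

This should print FFA00000 -> 0000A0FF, the same reversal as the first table row (00 00 A0 FF read back for r = A0, a = FF). The halved values in the 7F 7F 7F 7F row are a separate effect and are consistent with cairo storing colors premultiplied by alpha.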


Sean Doherty(Posted 2006) [#22]
Brucey,

It is probably obvious, but looking through the documentation that comes with libxml_bin, I don't see how to deploy it to BlitzMax and start using it. Can you point me in the right direction?

Thanks


bradford6(Posted 2006) [#23]
Sean,

Here is an example I cooked up. Give it a whirl:

This is the BlitzMax code:



Save this as "test.xml" in the same directory as above code snippet
This is the XML data:




Sean Doherty(Posted 2006) [#24]
Thanks, I will have a look at it. I don't have to place anything in the mod folder?


Fry Crayola(Posted 2006) [#25]
Just place the Bah.mod folder (that contains everything in libxml) into the mod folder (not in the pub.mod folder, just the mod folder itself).

It'll work then.


Sean Doherty(Posted 2006) [#26]
Is there a way to search for a node without creating a list of children and iterating through the list? For example, from MaXml:

pSectorXmlNode.FindChild("Name").value


morszeck(Posted 2006) [#27]
You can write a function for this. MaXml does nothing different... Take a look at the source of the MaXml module.

@Brucey, libxml is very fast. Thanks for integrating libxml2 into BlitzMax!
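
A minimal sketch of such a function, for anyone looking for a MaXml-style FindChild. FindChild itself is a hypothetical helper, and getName() and getContent() are assumed to exist on TxmlNode; getChildren() and the node-building calls are the ones that appear in this thread.

```blitzmax
SuperStrict

Import BaH.Libxml

' Hypothetical helper: returns the first child element called name, or Null.
' Assumes getChildren() returns a TList of TxmlNode (or Null when there are none).
Function FindChild:TxmlNode(parent:TxmlNode, name:String)
	Local children:TList = parent.getChildren()
	If children = Null Then Return Null

	For Local child:TxmlNode = EachIn children
		If child.getName() = name Then Return child
	Next
	Return Null
End Function

' Build a tiny document to search
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local root:TxmlNode = TxmlNode.newNode("Sector")
doc.setRootElement(root)

Local nameNode:TxmlNode = root.addTextChild("Name")
nameNode.setContent("Alpha")
root.addTextChild("Size")

Local found:TxmlNode = FindChild(root, "Name")
If found Then Print found.getName() + " = " + found.getContent()
```

This still iterates the child list internally, but it hides the loop behind one call, which is all the MaXml version does as well.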


bradford6(Posted 2006) [#28]
There is probably a much better way to accomplish what I did in the above example. That was my first shot at LibXML.

I am no XML wizard :)


Brucey(Posted 2006) [#29]
I should probably add a readme with the module to help people get started.

If you run docmods (it's in the BlitzMax/bin folder), it should integrate the libxml docs with BlitzMax. After restarting Max, you will be able to see them in the navigator pane, under Help -> Modules -> Libxml.

The libxml Help is fairly extensive, including two large tutorials (covering the Tree API and the Reader API) to help you get up and running.
Hopefully you will find it a little better documented than you are used to with Max ;-)

Any problems, lemme know...


morszeck(Posted 2006) [#30]
A question, and maybe I am just too stupid: how can I create a new XML document? More exactly: how do I create a root?

bah.libxml

local doc:txmldoc
local root:txmlnode

doc = txmldoc.newdoc( "1.0" )

... and now?

doc.savefile( "myxml.xml" )



bradford6(Posted 2006) [#31]
Brucey,

Thanks for an awesome module. I did a syncmods and there it is, all nicely organized in the Max Doc pane. Every example I have created has run flawlessly (except of course for 'user created' bugs).

I am going through your docs right now, the new ones are very good.

morszeck,

Here is a small example. You can read the output in an internet browser (Firefox, Opera, the Microsoft one...I forget the name of that one:) ):

' create and save a new XML document


SuperStrict				' SuperFly!


Import BaH.Libxml

Local fileURL:String = "created.xml"
Local Compression:Int = 0 		'ZLIB compression ranges from 0(none...fast) to 9(Max...Slow)
Local doc:TxmlDoc 
Local root:TxmlNode
Local node:TxmlNode


doc = TxmlDoc.newDoc("1.0")

root = Txmlnode.NewNode("RootNode")


doc.setRootElement(root)
node = root.addtextchild("monster")
node.addattribute("Name","Cuddly Monster")
node.addattribute("HairColor","brown")
node.addattribute("Ferocity","Rather Mild")
node.addattribute("FavoriteFood","Ice Cream")

node = root.addtextchild("monster")
node.addattribute("Name","Growler")
node.addattribute("HairColor","Red")
node.addattribute("Ferocity","Severe")
node.addattribute("FavoriteFood","Toes")

doc.setCompressMode(Compression)
doc.saveFormatFile(fileURL, True)





bradford6(Posted 2006) [#32]
here is another example of creating and saving a file:

' create and save a new XML document
SuperStrict				' SuperFly!

Import BaH.Libxml

Local fileURL:String = "created2.xml"
Local Compression:Int = 0 		'ZLIB compression ranges from 0(none...fast) to 9(Max...Slow)
Local doc:TxmlDoc 
Local root:TxmlNode
Local node:TxmlNode

doc = TxmlDoc.newDoc("1.0")

root = Txmlnode.NewNode("Geography")


doc.setRootElement(root)
node = root.addchild("Country")
	node.addattribute("Name","USA")
	node.addattribute("population","Millions")

		node = node.addchild("State")
			node.setcontent("California")
			node = node.addchild("Governor")
				node.setcontent("Arnold Schwarzeneggar")
					node = node.addtextchild("PreviousJob")
						node.setcontent("Actor")
							node = node.addtextchild("BestMovie")
								node.setcontent("Terminator")


doc.setCompressMode(Compression)
doc.saveFormatFile(fileURL, True)




RepeatUntil(Posted 2006) [#33]
Hi, libXml is VERY nice and very easy to use! I am using it now, and I am very happy with it! Thank you, Brucey!
Now I have a question/problem: I like to incbin all external files in my projects. I don't want my xml files to be visible to everyone, and I don't want the user to be able to modify the xml. So I tried using incbin on the xml files with libxml, but that does not work. Has anyone tried to incbin xml files with this lib? Is there some support for that? Could it be added? If not, what are the alternative solutions?
Thanks!


Brucey(Posted 2006) [#34]
I have not included access to ram-streams directly in the module, however, there is nothing stopping you loading the xml using a text stream (which I believe supports incbin) and then parsing that string using TxmlDoc.parseDoc(string).

I can also have a look at implementing access to incbin properly...

:-)


RepeatUntil(Posted 2006) [#35]
Do you mean something like this:

Incbin "myXmlFile.txt"
doc = TxmlDoc.parseFile("incbin::myXmlFile.txt")

In that case, this is not working for me, I got an error when I run:
I/O warning : failed to load external entity "incbin::myXmlFile.txt"

What should I do?

Thanks!


Brucey(Posted 2006) [#36]
No, it doesn't yet support memory streams at all... Looking at the API, I think I can probably get it to at some point, but for now....
SuperStrict

Import BaH.libxml
Import BRL.TextStream

Incbin "file.xml"

Local doc:TxmlDoc = TxmlDoc.parseDoc( LoadText("incbin::file.xml") )

' show the contents to stdout
doc.saveFormatFile("-", True)

I know it's not ideal to have to use LoadText(), but until I implement in-memory xml handling, it's the best there is...


taxlerendiosk(Posted 2006) [#37]
Bump. I don't really know much about XML yet, but I'm trying to learn. How do I apply a DTD validation on a TxmlDoc? I want to use a DTD that is "fixed" and applied by my Blitz program, not specified by the original XML document being parsed (that's what it means by "arbitrary DTD" I assume). I'm guessing it must have something to do with createExternalSubset and createInternalSubset but I don't see how to use them, and searching the Web isn't helping much.


taxlerendiosk(Posted 2006) [#38]
I'm looking for a way to be able to get all the attributes of an element/node, even ones I can't know the name of ahead of time, without using TxmlTextReader (which would use getAttributeByIndex). I thought the solution might be txmlnode.getChildren(XML_ATTRIBUTE_NODE), but that always just returns an empty set.
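
One possible approach, sketched under a loud assumption: that TxmlNode has a getAttributeList() method returning a TList of TxmlAttribute (or Null), and that TxmlAttribute has getName() and getValue(). These names are not confirmed anywhere in this thread and may not be present in every version of the module, so check the module docs before relying on them.

```blitzmax
SuperStrict

Import BaH.Libxml

' Build a node with two attributes, using calls that appear in this thread
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local root:TxmlNode = TxmlNode.newNode("monster")
doc.setRootElement(root)
root.addAttribute("Name", "Growler")
root.addAttribute("Ferocity", "Severe")

' Assumption: getAttributeList() returns a TList of TxmlAttribute, or Null if there are none
Local attrs:TList = root.getAttributeList()
If attrs Then
	For Local attr:TxmlAttribute = EachIn attrs
		' Assumption: getName() and getValue() exist on TxmlAttribute
		Print attr.getName() + " = " + attr.getValue()
	Next
End If
```

The point of the sketch is that attributes are held separately from child nodes in libxml2, which would explain why getChildren(XML_ATTRIBUTE_NODE) comes back empty.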


RepeatUntil(Posted 2006) [#39]
Hi Brucey,

I tried to use xpath with libxml2, but the example you are showing is not working at all!! I had a quick look, and I think you have a bug somewhere in your wrapper (no idea where).

Could you have a look at your example, and tell me if it works for you, and if not, could you try to debug it????

Many thanks! I am using your lib, and it is incredibly good!!


Brucey(Posted 2006) [#40]
I had trouble with xpath too...

I think perhaps I need to go googling for some examples of usage to see where I've gone wrong.


RepeatUntil(Posted 2006) [#41]
Here is a nice page:
http://www.w3schools.com/xpath/xpath_syntax.asp

Thanks!


Brucey(Posted 2006) [#42]
Thanks for the link...

After much testing, I found out what I was doing wrong.
Seems that I should have been iterating through a pointer array, rather than retrieving the first pointer and stepping thru its nodes...

Anyway, I'll let you know when I post the updated module.

I'm also thinking that I might do libxslt while I'm at it...


Difference(Posted 2006) [#43]
Thanks again Brucey
Getting libxslt would rock. :)


RepeatUntil(Posted 2006) [#44]
Thanks, thanks, thanks for your debugging!! I really need xpath, so I appreciate your effort!
I have no idea what libxslt is... I googled it, but I still do not understand... Never mind ;-)


Difference(Posted 2006) [#45]
libxslt lets you take an XML file, transform it using an XSL file (an XML stylesheet), and save the output. Some of the expressions you write in your XSL file can be XPath expressions. Like this: http://www.w3schools.com/xsl/xsl_transformation.asp

On Windows you can do the same with Microsoft's msxsl.exe: http://www.w3schools.com/xsl/xsl_transformation.asp

Having it inside BlitzMax means you can write a stylesheet (XSL file) to transform XML, which is very useful for converters, loaders, or reports in XML, HTML, or SVG formats.
OpenOffice lets you write document converters this way.


RepeatUntil(Posted 2006) [#46]
OK, good, but Brucey, please post your updated and debugged version of libxml (with xpath working) before working on libxslt... I need it;-)
Thanks again for your hard work!


Brucey(Posted 2006) [#47]
I thought that since I was fixing libxml I'd also work on adding more of the API to it.

As of last night, these are the current changes that will appear in the next release (which should be in a day or so) :

* Fixed TxmlNodeSet.getNodeList.
* API change - Added TxmlBase (for shared methods). Extended Node, Doc, Dtd, Attribute, Entity from it. Should be backwards compatible.
* Implemented debug-time assertion checking.
* Incbin support added for TxmlDoc and TxmlTextReader.
* Added more XPath functionality.
* Added libxml globals.
* Added Entities API.
* Added XML catalogs and SGML catalogs API.

For incbin support, you can now use : TxmlDoc.parseFile("incbin::afile.xml"), and the same for the textreader.
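
Put together, a minimal sketch of the new incbin route looks like this. It uses only calls that appear in this thread; afile.xml is a placeholder name, and the file must exist next to the source at compile time.

```blitzmax
SuperStrict

Import BaH.Libxml

' afile.xml is embedded into the executable when the program is built
Incbin "afile.xml"

Local doc:TxmlDoc = TxmlDoc.parseFile("incbin::afile.xml")
If doc Then
	' "-" writes the formatted document to stdout, as in the earlier LoadText() example
	doc.saveFormatFile("-", True)
Else
	Print "Could not parse the embedded file"
End If
```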

Will let you know when the update is available.

:o)


RepeatUntil(Posted 2006) [#48]
Wahou!!! This is great!!!! I can't wait!!


Brucey(Posted 2006) [#49]
If you are on win32 or linux you can now do :
syncmods -u <user> -p <pass> bah.libxml

(replacing <user> and <pass> appropriately)
to update the libxml module using syncmods :o)

(should make updating easier for everyone now...)

On top of the changes mentioned above :
* Added TxmlURI, TxmlCatalog, TxmlEntity, TxmlIncludeCtxt, TxmlLocationSet.
* Added XInclude API, XPointer API.
* misc doc/code fixes

Please let me know how you get on with it...

There's still plenty of API to include - it just takes so bloody long to get it coded / documented. Otherwise I'd have it all done by now.
The validation stuff will probably come next.
libxslt is well on the way


(many thanks to BRL for use of their modserver!)


RepeatUntil(Posted 2006) [#50]
Great! It's now downloaded. Xpath seems to work now. I will try to use it in my program...
<EDIT> Fantastic! I used it without problems in my program. This is working perfectly, and saved me a lot of time! Thanks again for this library!

<EDIT2> And the incbin stuff is also working perfectly! I can now incbin the xml file properly.
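
For anyone wanting to try XPath the same way, here is a minimal sketch of a query. The XPath calls (newXPathContext, evalExpression, getNodeSet) and the getName/getAttribute accessors are assumptions based on how the module mirrors libxml2 and may differ in your version; getNodeList is the method named in the change list above.

```blitzmax
SuperStrict

Import BaH.Libxml

' Build a small document with two <monster> nodes
Local doc:TxmlDoc = TxmlDoc.newDoc("1.0")
Local root:TxmlNode = TxmlNode.newNode("RootNode")
doc.setRootElement(root)

Local node:TxmlNode = root.addTextChild("monster")
node.addAttribute("Name", "Cuddly Monster")
node = root.addTextChild("monster")
node.addAttribute("Name", "Growler")

Local matches:Int = 0

' Assumed calls: newXPathContext(), evalExpression() and getNodeSet()
Local context:TxmlXPathContext = doc.newXPathContext()
Local result:TxmlXPathObject = context.evalExpression("//monster")
If result Then
	Local nodes:TxmlNodeSet = result.getNodeSet()
	If nodes And nodes.getNodeList() Then
		For Local hit:TxmlNode = EachIn nodes.getNodeList()
			Print hit.getName() + ": " + hit.getAttribute("Name")
			matches :+ 1
		Next
	End If
End If

Print "Matches: " + matches
```

The expression //monster selects every monster element anywhere in the document; see the XPath syntax page linked earlier in the thread for other patterns.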


Brucey(Posted 2006) [#51]
Glad to hear :-)

Good and constructive feedback helps to improve the code... thanks for your comments.


Sören(Posted 2007) [#52]
*Bump*

Ahm, I don't see any method/function for removing a node from a document. Something like xmlUnlinkNode? Or am I just not looking hard enough? :\


Brucey(Posted 2007) [#53]
Good find :-)

Interestingly, I added the API function call, but didn't implement it. Will fix. Thanks.