Fastest Way to Extract Text from HTML

c.k.(Posted 2012) [#1]
What's the fastest/easiest way to extract the auth_token and player_id values in the following HTML (XML?)?

<reply tick="135413803421">
  <user>
    <login status="ok">
      <auth_token>5125896321458745123</auth_token>
      <player_id>51425879563256985458</player_id>
    </login>
  </user>
</reply>



c.k.(Posted 2012) [#2]
I wrote this as a cheat:

[monkeycode]
Function get_tag:String(txt:String, what:String)
	' find the opening tag 'what' in 'txt'
	Local st:String = "<" + what + ">"
	Local i:Int = txt.Find(st)
	Local y:Int = i + st.Length()
	Local j:Int = 0

	' Find returns -1 when there is no match; 0 is a valid hit at the very start
	If i >= 0 Then
		' now find the closing tag, searching from the end of the opening tag
		st = "</" + what + ">"
		j = txt.Find(st, y)
		If j >= y Then
			Return txt[y .. j]
		EndIf
	EndIf
	Return ""
End
[/monkeycode]

But I'm wondering if there's a module out there where I could do:

[monkeycode]
xml.GetTag("auth_token")
[/monkeycode]
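
In the meantime, the same string search can be wrapped in a small class so that the call reads like the one above. This is only a sketch: TagReader is a made-up name, not part of any module, and it only matches opening tags without attributes (so it would not find the login element in the reply above).

[monkeycode]
' Hypothetical helper class, not part of any existing module.
Class TagReader
	Field txt:String

	Method New(txt:String)
		Self.txt = txt
	End

	' Returns the text between <what> and </what>, or "" if either tag is missing.
	Method GetTag:String(what:String)
		Local openTag:String = "<" + what + ">"
		Local closeTag:String = "</" + what + ">"
		Local i:Int = txt.Find(openTag)
		If i < 0 Then Return ""
		Local y:Int = i + openTag.Length()
		' search for the closing tag only after the opening tag
		Local j:Int = txt.Find(closeTag, y)
		If j < 0 Then Return ""
		Return txt[y .. j]
	End
End

Function Main:Int()
	' the reply from the first post, flattened onto one line (~q is an escaped quote)
	Local reply:String = "<reply tick=~q135413803421~q><user><login status=~qok~q><auth_token>5125896321458745123</auth_token><player_id>51425879563256985458</player_id></login></user></reply>"
	Local xml:TagReader = New TagReader(reply)
	Print(xml.GetTag("auth_token"))
	Print(xml.GetTag("player_id"))
	Return 0
End
[/monkeycode]

This prints the two values from the reply, one per line. It is still plain text matching, not real XML parsing.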


therevills(Posted 2012) [#3]
That's XML... and Diddy has an XML parser ;)


c.k.(Posted 2012) [#4]
Are there docs for using Diddy's XML parser?


therevills(Posted 2012) [#5]
Are there docs for using Diddy's XML parser?


Haha very funny ;)

Here's a quick example:
[monkeycode]
Local authTokenViaLoop:String
Local authTokenViaChild:String

Local file:String = "test.xml"
Local xmlReader:XMLParser = New XMLParser
Local doc:XMLDocument = xmlReader.ParseFile(file)
Local rootElement:XMLElement = doc.Root

' using For Loops to cycle between elements
For Local userXml:XMLElement = Eachin rootElement.GetChildrenByName("user")
	For Local loginXml:XMLElement = Eachin userXml.GetChildrenByName("login")
		authTokenViaLoop = loginXml.GetFirstChildByName("auth_token").Value
	Next
Next

' just using FirstChildByName
Local userNode:XMLElement = rootElement.GetFirstChildByName("user")
Local loginNode:XMLElement = userNode.GetFirstChildByName("login")
Local authTokenNode:XMLElement = loginNode.GetFirstChildByName("auth_token")
authTokenViaChild = authTokenNode.Value

Print "authTokenViaChild= " + authTokenViaChild
Print "authTokenViaLoop = " + authTokenViaLoop
[/monkeycode]

You would use the For Loop way if you had multiple users in your XML.
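
As a rough sketch of the multiple-users case, the loop can print every token instead of keeping only the last one. It uses only the Diddy calls shown above. The file name "users.xml" is hypothetical and is assumed to contain a reply element with several user elements inside it. The Null check assumes GetFirstChildByName returns Null when the element is missing.

[monkeycode]
Import diddy

Function Main:Int()
	' "users.xml" is a hypothetical file with several <user> elements under the root
	Local parser:XMLParser = New XMLParser
	Local doc:XMLDocument = parser.ParseFile("users.xml")

	For Local userXml:XMLElement = Eachin doc.Root.GetChildrenByName("user")
		For Local loginXml:XMLElement = Eachin userXml.GetChildrenByName("login")
			Local tokenXml:XMLElement = loginXml.GetFirstChildByName("auth_token")
			' skip logins without a token (assumes a missing child comes back as Null)
			If tokenXml <> Null Then Print("auth_token = " + tokenXml.Value)
		Next
	Next
	Return 0
End
[/monkeycode]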


Why0Why(Posted 2012) [#6]
Steve,

I understand I am a little thick sometimes, but I wanted to use XML to store data for a roguelike (monsters, items, etc.). Could you show me how I would pull, say, two different monsters with two or three elements each into a custom type?

[bbcode]
Class Monster
	Field Name:String
	Field HitPoints:Int
End
[/bbcode]


therevills(Posted 2012) [#7]
I did something similar for my Monkey Touch Tower Defense game:

http://code.google.com/p/monkey-touch/source/browse/trunk/game8.monkey

But for a quick example:

monsters.xml
<monsters>
	<monster>
		<name>WereWolf</name>
		<hitpoints>100</hitpoints>
	</monster>
	<monster>
		<name>Vampire</name>
		<hitpoints>50</hitpoints>
	</monster>
</monsters>

Monkey/Diddy code:[monkeycode]Strict

Import diddy

Global titleScr:Screen = New TitleScreen()

Function Main:Int()
	New MyGame()
	Return 1
End

Class MyGame Extends DiddyApp
	Method OnCreate:Int()
		Super.OnCreate()
		game.Start(titleScr)
		Return 0
	End
End

Class Monster
	Global list:ArrayList<Monster> = New ArrayList<Monster>
	Field name:String
	Field hitPoints:Int
	Field x:Int, y:Int

	Method New(name:String, hp:Int)
		Self.name = name
		Self.hitPoints = hp
		Self.x = Rnd(100, 300)
		Self.y = Rnd(100, 300)
		list.Add(Self)
	End

	Method Draw:Void()
		DrawText (name + ": " + hitPoints + "HP", x, y)
	End

	Function DrawAll:Void()
		If Not list Return
		Local m:Monster
		For Local i:Int = 0 Until list.Size
			m = list.Get(i)
			If m <> Null Then m.Draw()
		Next
	End
End

Class TitleScreen Extends Screen
	Method Start:Void()
		Local file:String = "monsters.xml"
		Local xmlReader:XMLParser = New XMLParser
		Local doc:XMLDocument = xmlReader.ParseFile(file)
		Local rootElement:XMLElement = doc.Root

		Local name:String
		Local hitPoints:Int
		For Local monsterXml:XMLElement = Eachin rootElement.GetChildrenByName("monster")
			name = monsterXml.GetFirstChildByName("name").Value
			hitPoints = Int(monsterXml.GetFirstChildByName("hitpoints").Value)
			New Monster(name, hitPoints)
		Next
	End

	Method Render:Void()
		Cls
		Monster.DrawAll()
	End

	Method Update:Void()
		If KeyHit(KEY_ESCAPE)
			FadeToScreen(game.exitScreen)
		End
	End
End[/monkeycode]


Samah(Posted 2012) [#8]
I just have to post my own code here for completeness... XD
[monkeycode]Class Monster
	Field name:String
	Field hitPoints:Int
End

Class MonsterReader
	Function ReadFile:ArrayList<Monster>(filename:String)
		Local rv:ArrayList<Monster> = New ArrayList<Monster>
		Local parser:XMLParser = New XMLParser
		Local doc:XMLDocument = parser.ParseFile(filename)
		For Local monsterNode:XMLElement = EachIn doc.Root.GetChildrenByName("monster")
			Local monster:Monster = New Monster
			monster.name = monsterNode.GetFirstChildByName("name").Value
			monster.hitPoints = Int(monsterNode.GetFirstChildByName("hitpoints").Value)
			rv.Add(monster)
		Next
		Return rv
	End
End

Function Main:Int()
	Local monsters:ArrayList<Monster> = MonsterReader.ReadFile("monsters.xml")
	For Local monster:Monster = EachIn monsters
		' do stuff with monster
	Next
	Return 0
End[/monkeycode]


Why0Why(Posted 2012) [#9]
Thanks guys, that is extremely helpful. I am getting ready to work on some data loader classes and that is exactly what I need. I really appreciate the help and taking the time for a nice example.


c.k.(Posted 2012) [#10]
I don't understand this part:

[monkeycode]
Class Monster
	Global list:ArrayList<Monster> = New ArrayList<Monster>
[/monkeycode]

Why wouldn't that Global be outside the class? Aren't you creating a new global list for each Monster you create?


therevills(Posted 2012) [#11]
Why wouldn't that Global be outside the class?

Because you want to encapsulate the data, so everything related to Monsters is part of the Monster class.

Aren't you creating a new global list for each Monster you create?

Nope. A Global declared inside a class belongs to the class itself rather than to each object, which is what other languages call static. So you can create 1000 monsters but you will only have one monster list.

To access the monster list from anywhere you just do:
[monkeycode]Monster.list[/monkeycode]
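
A small sketch of the difference between a Global and a Field inside a class. The count variable is made up for illustration and is not part of the example above.

[monkeycode]
' Illustrative only: "count" is a hypothetical class-level variable.
Class Monster
	Global count:Int = 0	' one copy, owned by the class
	Field name:String	' one copy per Monster object

	Method New(name:String)
		Self.name = name
		count += 1	' every constructor call updates the same variable
	End
End

Function Main:Int()
	New Monster("WereWolf")
	New Monster("Vampire")
	Print(Monster.count)	' reached through the class name, no object needed
	Return 0
End
[/monkeycode]

This prints 2, because both monsters incremented the same class-level variable.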


c.k.(Posted 2012) [#12]
So, the "Global" means "available to all Monster objects," not "available to everybody everywhere"?

Well, I just noticed you said it is available "from everywhere."

o_O

I guess that's just how it is.


therevills(Posted 2012) [#13]
You can of course create your lists outside of a class:
[monkeycode]Global monsterList:ArrayList<Monster> = New ArrayList<Monster>
Global bulletList:ArrayList<Bullet> = New ArrayList<Bullet>
Global heroList:ArrayList<Hero> = New ArrayList<Hero>

Class Monster
End

Class Bullet
End

Class Hero
End[/monkeycode]

But when using object-oriented programming you do want to try to encapsulate everything together when possible.

With the above code, if the classes use their lists, I can no longer just copy the classes into another project, because I also need to copy the outside global lists. If the lists were part of the class, this wouldn't be an issue.

Well, I just noticed you said it is available "from everywhere."
Yep as long as you reference the class first: Monster.list


c.k.(Posted 2012) [#14]
Ahhh. OK. Thanks! :-)