[Solved] Read a webpage content ?

Monkey Targets Forums/HTML5/[Solved] Read a webpage content ?

golomp(Posted 2015) [#1]
Hi,

With Monkey we can GET and POST information with HttpRequest. It works fine.

Is it possible to read the content of a webpage itself?

I mean, for example, a webpage like this:
<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
   <HEAD>
      <TITLE>
         A Small Hello
      </TITLE>
   </HEAD>
<BODY>
   <H1>Hi</H1>
   <P>This is very minimal "hello world" HTML document.</P>
</BODY>
</HTML>


Is it possible to read, for example, the first 21 characters and get the string:
"<!DOCTYPE html PUBLIC"


Midimaster(Posted 2015) [#2]
Of course! The HttpRequest class has a lot of methods. See the Monkey Quick Help or the quick reference in the right corner of Ted.

req:HttpRequest
....
    Method OnHttpRequestComplete:Void( req:HttpRequest )
        Print "ResponseText="+req.ResponseText()     
    End
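
Once the response text has arrived, getting the first 21 characters is an ordinary substring operation. Here is a minimal sketch in TypeScript, the language family the HTML5 target compiles down to; the helper name `firstChars` and the hard-coded `responseText` are made up for illustration and stand in for whatever `ResponseText()` returns. In Monkey itself the equivalent would presumably be a string slice such as `req.ResponseText()[..21]`.

```typescript
// Hypothetical helper: returns at most `count` leading characters of `text`.
function firstChars(text: string, count: number): string {
  return text.slice(0, Math.max(0, count));
}

// Stand-in for the body an HTTP GET would return (start of the sample page).
const responseText = '<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<HTML>';

console.log(firstChars(responseText, 21)); // prints: <!DOCTYPE html PUBLIC
```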



golomp(Posted 2015) [#3]
Thank you Midimaster.

I have some difficulty using your answer. The HttpRequest sample supplied with Monkey
requires a parameter ("POST" or "GET").

Do you have an example?


TeaBoy(Posted 2015) [#4]
@golomp,

You could do something like this:

Import mojo
Import brl.httprequest

Class GetWebPage Extends App Implements IOnHttpRequestComplete 

  Const httpLink     : String = "http://127.0.0.1/mywebsite.html"
  
  Field getRequest : HttpRequest = Null
  Field getText       : String = ""
  
  Method OnCreate()
    getRequest = New HttpRequest()
    
    SendRequest(httpLink)
	
    SetUpdateRate(60)
  End Method
  
  Method OnHttpRequestComplete:Void(request : HttpRequest)
    getText  = request.ResponseText()
  End Method
  
  Method SendRequest(link : String)
    getRequest.Open("GET", link, Self)
    getRequest.Send
  End Method
  
  Method OnRender()
   Cls(0, 0, 0)
   DrawText(getText, 0, 0)
  End Method
  
  Method OnUpdate()
    UpdateAsyncEvents
  End Method

End Class

Function Main()
  New GetWebPage()
End Function



golomp(Posted 2015) [#5]
Hi TeaBoy,

Thank you.

Did you try it? I can't get it to work; I only get a black screen.
I put an HTML file in my data directory, and I also tried a (simple HTML) webpage on the internet,
with the same result: a black screen. I use Firefox.
Normally GET is for accessing variables passed to PHP, no?


TeaBoy(Posted 2015) [#6]
Ah, sorry, I thought you meant reading a page on your own web server. I don't think you can use HttpRequest
that way due to security concerns.

[EDIT] just a thought

Import mojo
Import brl.httprequest

Class GetWebPage Extends App Implements IOnHttpRequestComplete 

  Const httpLink : String = "http://127.0.0.1/getpage.php?page=http://www.google.com"

  Field getRequest : HttpRequest = Null
  Field getText : String = ""
  
  Method OnCreate()
    getRequest = New HttpRequest()

    SendRequest(httpLink)

    SetUpdateRate(60)
  End Method
  
  Method OnHttpRequestComplete:Void(request : HttpRequest)
    getText  = request.ResponseText()
  End Method
  
  Method SendRequest(link : String)
    getRequest.Open("GET", link, Self)
    getRequest.Send
  End Method
  
  Method OnRender()
   Cls(0, 0, 0)
   
   DrawText(getText, 0, 0)
  End Method
  
  Method OnUpdate()
    UpdateAsyncEvents
  End Method

End Class

Function Main()
  New GetWebPage()
End Function


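getpage.php

<?php
  // Only fetch http(s) URLs so the script cannot be used to read local files.
  $url = isset($_GET['page']) ? $_GET['page'] : '';
  if (!preg_match('#^https?://#i', $url)) {
    http_response_code(400);
    exit('Invalid page parameter');
  }

  // file_get_contents needs allow_url_fopen enabled to fetch remote URLs.
  echo file_get_contents($url);
?>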
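
One thing to watch with the proxy approach: the target address is itself a URL, so any `?`, `&` or `#` in it can get mixed up with the proxy's own query string. A small sketch of percent-encoding the `page` parameter before sending the request, written in TypeScript; `buildProxyUrl` and the example search URL are made up for illustration. PHP decodes `$_GET['page']` automatically, so the script side should not need to change.

```typescript
// Hypothetical helper: builds the proxy request URL with the target page
// percent-encoded, so its own "?", "&" and "#" cannot break the query string.
function buildProxyUrl(proxyBase: string, targetPage: string): string {
  return proxyBase + "?page=" + encodeURIComponent(targetPage);
}

// The proxy address comes from the post above; the target URL is only an example.
const proxyUrl = buildProxyUrl(
  "http://127.0.0.1/getpage.php",
  "http://www.google.com/search?q=monkey&hl=en"
);

console.log(proxyUrl);
// prints: http://127.0.0.1/getpage.php?page=http%3A%2F%2Fwww.google.com%2Fsearch%3Fq%3Dmonkey%26hl%3Den
```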



golomp(Posted 2015) [#7]
Hi TeaBoy,

Yes, I think PHP is a good alternative solution. This way I can manage a POST variable
and retrieve it with a GET request.

Thank you TeaBoy.


Landon(Posted 2015) [#8]
I use this all the time to make PHP requests that return database info. Usually I just set XML as the content type in my PHP header, then do a LoadString and run the result through a parser. Works wonderfully.


golomp(Posted 2015) [#9]
Thank you Landon, I will also consider this solution.


Landon(Posted 2015) [#10]
Here's a quick example.

I'm using Diddy in my app. For instance, to check login credentials I implemented HttpRequest on a Diddy screen. It connects to my website, which runs WordPress, with a WordPress plugin I wrote to handle the request.


Class TitleScreen Extends Screen Implements IOnHttpRequestComplete
	Field menu:SimpleMenu
	Field musicFormat:String
	Field bgim:Image
	Field widgets:WidgetManager
	Field login:VCTextBox
	Field pass:VCTextBox
	Field loginbutt:ImagePushButton
	Field get_req:HttpRequest
	Field islogging:Bool = False
	Field loadspin:Int = 360
	Field errors:String
	Field playsxml:XMLElement
	Method New()
		name = "TitleScreen"
	End
	
	Method Start:Void()
		widgets = New WidgetManager(UICURSOR)

		#If TARGET="flash" Or TARGET="ios" Or TARGET = "android"
			bgim = LoadImage("monkey://data/graphics/gamescreen.png")
		#Else
			bgim = LoadImage("monkey://data/graphics/gamescreenan.png")
		#Endif
		

		login = New VCTextBox(SCREEN_WIDTH/2-150, 280, 300, 30, Mainfont, New VCTextboxSkin, UICURSOR)
		pass = New VCTextBox(SCREEN_WIDTH/2-150, 360, 300, 30, Mainfont, New VCTextboxPassSkin, UICURSOR)

		loginbutt = New ImagePushButton("Login",SCREEN_WIDTH/2-150,500, 300, 40, UICURSOR,BaseButton)
		widgets.Attach(login)
		widgets.Attach(pass)
		widgets.Attach(loginbutt)
	End
	
	Method Render:Void()
		If HaveBrush = False Then
			Cls
			DrawImage(Brush,0,0)
			BrushPixmap = New Int[Brush.Width*Brush.Height]
			ReadPixels( BrushPixmap, 0, 0, Brush.Width, Brush.Height )
			HaveBrush = True
		Endif
		Cls
		'DrawText "TITLE SCREEN", SCREEN_WIDTH2, 10, 0.5, 0.5
		DrawImage(bgim,0,0)
		Gamefont.DrawHTML("Username:", SCREEN_WIDTH/2-150,250)
		Gamefont.DrawHTML("Password:", SCREEN_WIDTH/2-150,330)	
		Gamefont.DrawHTML(errors, SCREEN_WIDTH/2-150,530)		
		widgets.RenderAll()
		If islogging = True Then
			SetAlpha(0.5)
			SetColor(0,0,0)
			DrawRect(0,0,SCREEN_WIDTH,SCREEN_HEIGHT )
			SetAlpha(1)
			SetColor(255,255,255)
			DrawImage(Asyncload,SCREEN_WIDTH2,SCREEN_HEIGHT2,loadspin,1,1,0 )
			loadspin = loadspin - 5
			If loadspin < 0 Then loadspin = 360
		Endif
	End
	
	Method ExtraRender:Void()
		'menu.Draw()
	End

	Method OnHttpRequestComplete:Void( req:HttpRequest )
	
				Print "Status="+get_req.Status()
				'Print "Loaded: "+get_req.ResponseText()
				Local parser:XMLParser = New XMLParser
				'Print get_req.ResponseText()
				If get_req.ResponseText() <> "" then
					Local xmldoc:XMLDocument = parser.ParseString(get_req.ResponseText())
					Local root:XMLElement = xmldoc.Root
					If root.Children().Length() > 0 Then
						Local id:Int = 0
						For Local playnode:XMLElement = Eachin root.Children()			
							If Int(playnode.GetAttribute("id")) > 0 Then
								errors = ""
								playsxml = playnode
								playlistScreen = New PlayListScreen
								FadeToScreen(playlistScreen)
								islogging = False
							Elseif Int(playnode.GetAttribute("id")) = -1 then
								errors = "Incorrect Login!"	
								islogging = False
							Elseif Int(playnode.GetAttribute("id")) = -2 then
								errors = "Subscription is inactive or expired!"	
								islogging = False
							Elseif Int(playnode.GetAttribute("id")) = -3 then
								errors = "You do not have authorization to view!"	
								islogging = False							
							Endif
						Next		
					Else
						errors = "COULD NOT CONNECT!"
						islogging = FALSE
					Endif
				Else
				
						errors = "COULD NOT CONNECT!"
						islogging = FALSE				
				Endif
	End
	
	Method Update:Void()
		UpdateAsyncEvents
		UICURSOR.Poll()
		widgets.PollAll()	
		
		If loginbutt.hit And islogging = false Then
			PlaySound(buttonclick)
			islogging = True
			
			get_req=New HttpRequest( "POST","http://mywebsiteurl.com/wp-admin/admin-ajax.php",Self )
			get_req.Send("action=vcgplays&username="+login.Text+"&password="+pass.Text, "application/x-www-form-urlencoded")
		Endif
		
	End
End


Then in my WordPress plugin I have an AJAX function that goes something like this:

function vcgetplays() {

	$creds = array();
	$creds['user_login'] = $_POST['username'];
	$creds['user_password'] = $_POST['password'];	
	$user = wp_authenticate( $_POST['username'],$_POST['password']);
	if ( is_wp_error($user) )
	{
		   header ("Content-Type:text/xml");
		   echo '<?xml version="1.0"?>';
		   echo '<VCFB>';		  
		   echo '<PLAYLIST added="0" id="-1">
		   '.htmlspecialchars($_POST['username']).' '.$user->get_error_message();
		   echo '</PLAYLIST>';
	       echo '</VCFB>';		
		   die();
	}
	else
	{
		//$coach = *cut this out for security*;
		//$status =  *cut this out for security*;
		
		if($status==true)
		{

				$allowedplay = true;
	
			if($allowedplay==true)
			{
				header ("Content-Type:text/xml");
				echo '<?xml version="1.0"?>';
				echo '<VCFB>';	  
				
				echo '<PLAYLIST id="'.$coach.'">';
				$args = array(
					'posts_per_page'   => 100,
					'offset'           => 0,
					'author'         => $coach,
					'orderby'          => 'post_date',
					'order'            => 'DESC',
					'post_type'        => 'vcfb_playlist',
					'post_status'      => 'publish',
					'suppress_filters' => true );		
				$posts_array = get_posts( $args );
				foreach ( $posts_array as $post )
				{
					echo '<PLAY name="'.$post->post_name.'" id="'.$post->ID.'">';
						$url = wp_get_attachment_url( get_post_thumbnail_id($post->ID),'thumbnail' );
						echo '<THUMB src="'.$url.'" />';
						$meta_value = get_post_meta( $post->ID, 'playlistfile', true );
						if(!empty($meta_value))
						{
							echo '<FILE src="'.$meta_value.'" />';
						}
					echo '<SCORE>';
					$hscores = get_post_custom_values('highscores', $post->ID);
					echo $hscores[0];
					echo '</SCORE>';	
					echo '</PLAY>';
				
				
				}
				echo '</PLAYLIST>';
				echo '</VCFB>';		
				die();
			} else {
			   header ("Content-Type:text/xml");
			   echo '<?xml version="1.0"?>';
			   echo '<VCFB>';		  
			   echo '<PLAYLIST added="0" id="-3">
			   You are not Authorized to view these plays!';
			   echo '</PLAYLIST>';
			   echo '</VCFB>';		
			   die();				
			}
		}
		else
		{
		   header ("Content-Type:text/xml");
		   echo '<?xml version="1.0"?>';
		   echo '<VCFB>';		  
		   echo '<PLAYLIST added="0" id="-2">
		   EXPIRED SUBSCRIPTION';
		   echo '</PLAYLIST>';
	       echo '</VCFB>';		
		   die();		
		}
	}
}


Basically, this WordPress plugin I wrote checks the login info, authenticates it, then fetches a list of plays as XML and sends it back to the app.
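
A note on the login request above: the username and password are concatenated straight into the form body, so a `&`, `=` or `+` in either value would corrupt the fields the server sees. A sketch of percent-encoding each field first, in TypeScript; `buildFormBody` and the sample credentials are made up for illustration. It uses `%20` rather than `+` for spaces, which servers decode the same way.

```typescript
// Hypothetical helper: encodes key/value pairs as an
// application/x-www-form-urlencoded request body.
function buildFormBody(fields: Record<string, string>): string {
  return Object.keys(fields)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]))
    .join("&");
}

// Made-up credentials containing characters that would otherwise corrupt the body.
const formBody = buildFormBody({
  action: "vcgplays",
  username: "coach&co",
  password: "p=ss w+rd",
});

console.log(formBody);
// prints: action=vcgplays&username=coach%26co&password=p%3Dss%20w%2Brd
```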


golomp(Posted 2015) [#11]
Thanks a lot for sharing this impressive example, Landon.

Monkey is really able to do a lot of things.

...and sometimes I also use it for games :)

Thank you again Landon.


diemar(Posted 2016) [#12]
I am still having problems understanding the security limitations in cross-server or cross-source situations. For instance, LoadString(url) works perfectly in HTML5 when accessing e.g. an HTML page on the same localhost as monkeygame.html, served by Apache. But when I try to access an outside URL like Yahoo, or a file:// address (or vice versa), Opera throws security violation script errors on XP, and even when monkeygame is hosted by xy.com, it cannot LoadString from ab.com. Was that possibly a bug in my old version 84f? (Yes, I may update. By the way, what is the recommended source for the latest Windows binary release? Thanks.)
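
The pattern described here matches the browser same-origin policy rather than a Monkey bug: a script may normally read responses only from the same scheme, host and port as the page it runs on, unless the other server opts in with CORS headers such as `Access-Control-Allow-Origin`. A simplified sketch of that comparison in TypeScript; `isSameOrigin` and the example URLs are made up for illustration, and real browsers apply additional rules.

```typescript
// Hypothetical helper: true when both URLs share scheme, host and port,
// which is the core comparison behind the same-origin policy.
function isSameOrigin(pageUrl: string, targetUrl: string): boolean {
  const page = new URL(pageUrl);
  const target = new URL(targetUrl);
  // Simplification: browsers typically treat file:// pages as opaque origins.
  if (page.protocol === "file:" || target.protocol === "file:") {
    return false;
  }
  return page.protocol === target.protocol
    && page.hostname === target.hostname
    && page.port === target.port;
}

// Placeholder URLs modelled on the cases described in the post.
console.log(isSameOrigin("http://localhost/monkeygame.html", "http://localhost/page.html")); // true
console.log(isSameOrigin("http://xy.com/monkeygame.html", "http://ab.com/page.html"));       // false
console.log(isSameOrigin("file:///C:/game/monkeygame.html", "http://localhost/page.html"));  // false
```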

So my problem condenses to a simple question: what is the easiest and most reliable way to load an HTML page from the web into a JS string, then document.write it to an iframe, in the HTML5 target?

It would be great if it worked with web content when hosted locally on the client side (file://), using the sandbox option for the iframe, but it is also OK if it needs to be hosted by a public web server.

All help greatly appreciated, thanks.


diemar(Posted 2016) [#13]
This is really my top priority. I've uploaded a test file; maybe it's just my browsers. Could somebody with a confirmed HTML5 browser test it? Enter a web URL in the text field. The app will try to LoadString the page and write this string into an iframe:

http://anticrap.byethost3.com/

thanks


Xaron(Posted 2016) [#14]
Is that legit? I get a blocked message from our proxy because of "Phishing".

Besides that, I think there is no easy way for an HTML5 app to load external stuff. At least I haven't found out how yet.


diemar(Posted 2016) [#15]
It is on byethost, free hosting. When I uploaded monkeygame.html and tried to load it in the browser, I was redirected to some Chinese page. After renaming it to index.html it loaded, but indeed, these free hosts sometimes insert their own scripts, maybe sniffing around.
However, my upload is a Monkey HTML5 compilation: a form that takes a URL, tries to load it using LoadString, filters it, then writes it to an iframe. Not phishing to me, more like anti-phishing. But, right, these free hosts... bad things turn worse, Murphy's law. Remember the good old last millennium? ^^ Anyway, thanks for trying. I'll keep on searching; it's against the very nature of the web not to be able to do that. Every browser does it.