Download & Recover
Hi, I'm trying to figure out how to resolve a problem. I'm using a function to download a file from a site. It works perfectly and it's quite fast. But I have this problem: if for some reason or error the program stops, the download (of course) is interrupted. The file is 'broken' but saved. I created an output file to save the 'position' of reading, so technically I know 'where' the download was interrupted. The only problem is that SeekStream() (the function I use) doesn't work on a stream that cannot be read, so everything starts again from 0. Any ideas or solutions? (I was thinking of using an external program like Wget, which *should* support resuming from the break point - I think)
There is a header you can pass to the server that asks it to resume the file from a certain point. I'm not sure how you go about setting everything up manually, but libcurl has built-in support for doing stuff like this. Note that the server needs to support it too, or it won't work anyway.
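For reference, the header in question is the standard HTTP Range request header. A rough sketch of what a resumed transfer looks like on the wire (host, path, and offsets here are invented for illustration):

    GET /file.zip HTTP/1.1
    Host: example.com
    Range: bytes=500000-

    HTTP/1.1 206 Partial Content
    Content-Range: bytes 500000-1048575/1048576
    Content-Length: 548576

A server that doesn't support ranges simply answers 200 OK and sends the whole file from byte 0.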
Ok, so I will focus on libcurl... thanks. So recovery depends on the server side? The servers are not mine in any case.
Ok, I found a solution! It uses libcurl (thanks to Brucey's mod). After some internet research I discovered how to recover an interrupted download. I tested it with small files (300-500 KB) because my connection here is capped at 100 MB/day... I will test with BIGGER files (maybe some ISOs!) to check that everything works. I presume FileSize can handle big files...
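For anyone digging this up later, here is a minimal sketch of that resume logic using libcurl's C API directly (Brucey's module wraps the same library, so the equivalent options should be available from BlitzMax); the URL and local filename are placeholders, and error handling is kept to the bare minimum:

    #include <stdio.h>
    #include <sys/stat.h>
    #include <curl/curl.h>

    int main(void)
    {
        const char *path = "download.part";               /* local target; name is made up */
        const char *url  = "http://example.com/file.zip"; /* placeholder URL */
        struct stat st;
        curl_off_t offset = 0;

        /* if a partial file already exists, resume from its current size */
        if (stat(path, &st) == 0)
            offset = (curl_off_t)st.st_size;

        FILE *out = fopen(path, offset ? "ab" : "wb");    /* append when resuming */
        if (!out) return 1;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);            /* default write callback writes here */
        curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, offset); /* makes libcurl send the Range header */

        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "curl: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        fclose(out);
        return (res == CURLE_OK) ? 0 : 1;
    }

The key option is CURLOPT_RESUME_FROM_LARGE: it takes a 64-bit offset, so big files should not be a problem on the client side either.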
It does not matter what download program you use: server A has a file, client B wants the file. B requests the file from A with no "startPosition" given: A sends from position 0. If B pauses the download, or crashes, or whatever, it just has to request the file from A again, but this time it can alter the position. If A "understands" what B requests, you will just receive the "rest" of the file. Server side you would do something along the lines of the following snippet (coming straight from the download script I use to track files while using canonical URLs like mydomain.de/file/filexyz.zip):

    //check if http_range is sent by browser (or download manager)
    if (isset($_SERVER['HTTP_RANGE'])) {
        list($sizeUnit, $rangeOriginal) = explode('=', $_SERVER['HTTP_RANGE'], 2);
        if ($sizeUnit == 'bytes') {
            //multiple ranges could be specified; we only handle the first one
            list($range, $extraRanges) = explode(',', $rangeOriginal, 2);
        } else {
            $range = '';
            header('HTTP/1.1 416 Requested Range Not Satisfiable');
            exit;
        }
    } else {
        $range = '';
    }

    //figure out the download piece from the range (if set)
    list($seekStart, $seekEnd) = explode('-', $range, 2);

    //set start and end based on the range (if set), else set defaults;
    //also check for invalid ranges
    $seekEnd   = (empty($seekEnd)) ? ($fileSize - 1) : min(abs(intval($seekEnd)), ($fileSize - 1));
    $seekStart = (empty($seekStart) || $seekEnd < abs(intval($seekStart))) ? 0 : max(abs(intval($seekStart)), 0);

    //only send the partial content header if downloading a piece of the file (IE workaround)
    if ($seekStart > 0 || $seekEnd < ($fileSize - 1)) {
        header('HTTP/1.1 206 Partial Content');
        header('Content-Range: bytes '.$seekStart.'-'.$seekEnd.'/'.$fileSize);
        header('Content-Length: '.($seekEnd - $seekStart + 1));
    } else {
        header("Content-Length: $fileSize");
    }

    header('Accept-Ranges: bytes');
    fseek($fileHandler, $seekStart);

So what you might see there: your request to the server needs to contain the Range header (PHP exposes it as $_SERVER['HTTP_RANGE']), which specifies what part of the file you want.

bye
Ron
Hi, I know there are 'negotiation headers', but it seems that libcurl handles them automatically (of course the server MUST accept & support resumed downloads). On my site this is true (without any changes by me). I will test on other websites. Of course, if the server doesn't handle this, there's very little to do other than re-download the file. Bye
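A quick way to check whether a particular server supports resuming is a HEAD request; if the response advertises Accept-Ranges: bytes, range requests should work (URL and sizes below are made up):

    curl -I http://example.com/file.zip

    HTTP/1.1 200 OK
    Accept-Ranges: bytes
    Content-Length: 1048576

A server may still honour ranges without advertising them, so treat this as a hint rather than a guarantee.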
That handling is of course done by the "server" (Apache2, nginx, ...). But as soon as you route your files through scripts (e.g. ones that dynamically inject affiliate codes, or code to identify a specific user), you have to include a snippet like the one above in that script's code.

bye
Ron
Oh yes, thanks, your script is very useful for this. But my idea is to create a 'general download manager' in BlitzMax, so I have no idea 'where' the files are downloaded from (i.e. BlitzBasic.com or xyz.com, google.com, adobe.com, etc.). If the server supports resume, good; otherwise - at this point - I will force a redownload. Thanks to all!
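A sketch of that "force a redownload" fallback, continuing the earlier C example; exactly how libcurl reports an ignored Range header can vary, so both checks below are defensive assumptions rather than guaranteed behaviour:

    /* after curl_easy_perform() in the earlier sketch */
    long status = 0;
    curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
    if (res == CURLE_RANGE_ERROR || (offset > 0 && status == 200)) {
        /* the server rejected or ignored the Range header, so the partial
           file cannot be trusted: truncate it and download again from 0 */
        out = freopen(path, "wb", out);
        curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, (curl_off_t)0);
        res = curl_easy_perform(curl);
    }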
"general download manager" - how general? Over HTTP it's easy. FTP gets a bit more difficult. For HTTPS and SFTP you'll need to use the libcurlssl module, which includes support for certificates, etc.
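If HTTPS ever does become necessary, the additions on the libcurl side are small; a sketch reusing the easy handle from the earlier example (the CA bundle path is an assumption, and on the BlitzMax side this is where the libcurlssl module mentioned above comes in):

    /* extra options for an HTTPS download (same easy handle as before) */
    curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/file.zip");
    curl_easy_setopt(curl, CURLOPT_CAINFO, "cacert.pem");  /* path to a CA bundle; an assumption */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1L);    /* verify the server certificate */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 2L);    /* check the hostname matches the cert */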
With 'general' I mean not locked to my website. And I don't want to touch FTP things, or HTTPS if possible!