Info on streams

BlitzMax Forums/BlitzMax Programming/Info on streams

JoshK

(Posted 2008) [#1]

I am thinking about a paging system which would swap memory to and from the hard drive and have some questions. I don't know what the limiting factors of stream reading and writing are.

-Is there any cost of traversing a stream? If I do stream.seek(5000) from position 0, is there a cost of this operation dependent on position?

-Would it be better to have ten streams open and read them each from the beginning, or could I have one big stream with all the data, and jump around however I want at no cost?

-Can BlitzMax streams feasibly be used for quickly reading and writing large linear chunks of data to and from memory in real-time?

Basically I am just trying to determine what the nature of dynamic stream reading is.

ImaginaryHuman

(Posted 2008) [#2]

You trying to do paged geometry or something?

I would presume that seeking in a file does take a little amount of time - it may involve the hard drive head having to move to get to the data. If you could read the file in more of a sequence or optimized for consecutive sections being read I think that would possibly be better than jumping around a lot. `Seek time` is a harddrive issue and once you've `sought` you ideally want to just read as much as possible. Hopefully your file would not be too big as to be scattered all over the place on the harddrive. It would depend slightly on which o/s you are on, like MacOS does some automatic defragmenting when you use the system installer.

I would think you'd want to examine what scenario you are going to use this data for and how much random access such a scenario will entail, like if you are paging sections of landscape or mesh data, it may be beneficial to store `areas` in a single stream and then within that stream only have a few sub-areas to seek for.

One thing to consider is if you have multiple streams that means probably multiple files which means scattered data which means longer seek times, unless you're thinking to have multiple streams to the same file (is that possible?)

You would have to do some tests to check the speed, and ideally threading would be really helpful here, but the file i/o stuff is mainly done by the o/s so it should be pretty quick and not too hindered by Max itself. Also you have to factor in things like disk caches and stuff. It's obviously been done, like in the megatexture system, so it is doable, but I would probably approach it by trying to stream approximately the same amount of data per frame and make sure to spread out the upload as thinly as possible given the amount of time you have to load or save it. Ideally you want to save/load just a small chunk of data per frame if possible. Given that we don't really have cross-platform threads yet, you'll probably want to sort of `manage` the streaming yourself, ie loading a few chunks per frame, rather than just let some stream thread run in the background.

JoshK

(Posted 2008) [#3]

Something like that. The terrain would be split into 128x128 sectors, each of which would have a height array, normal map, and alpha map. If I can get it working with a small area, then infinitely large terrain is not a problem.

I actually have no interest in the megatexture approach, but with a 16384x16384 terrain, you can't store the normal data or alpha layers in system memory.

A company has expressed interest in the addition of this to our engine, in the form of a lot of money.

Otus	(Posted 2008) [#4]

Unless I'm totally wrong, I remember that file streams try to read the whole file to memory at first (see BRL.Stream.ReadFile), so there would probably be little extra cost jumping around the stream. That also means you won't free any memory before closing the file.

Brucey

(Posted 2008) [#5]

You can create your own stream types which can work the way you need...

BlitzSupport

(Posted 2008) [#6]

Would it be better to have ten streams open and read them each from the beginning, or could I have one big stream with all the data, and jump around however I want at no cost?

You might want to look into filemapping, at least for Windows (you'd have to research how it's done on the other OSes).

You're placing a memory 'window' of a particular size (64K) over a 'map' of the file and then operating on the contents of that memory 'window', moving it back and forth, etc, but it's the fastest way to read/write files in Windows, since it's carried out by the Virtual Memory Manager's low level disk access functions, rather than the high level file operations you'd normally have access to:

  ... - file
  [ ] - 64K window

....[...]...................

Although you tell it to map the whole file, it only places 64K into memory at a time, the optimum 'data chunk' size on Windows. Here's a PB example (apologies)...

Procedure ReadMappedFile (f$) 
    
    ; Used to pass 64-bit number to MapViewOfFile in two parts... 
    
    Structure HiLo 
        hi.l 
        lo.l 
    EndStructure 

    ; Filemapping REQUIRES use of system's most efficient memory chunk size... 
    
    GetSystemInfo_ (info.SYSTEM_INFO) 
    chunk = info\dwAllocationGranularity ; Usually 64K / 65536 
    
    view.HiLo 

    ; How many times to read 'chunk' bytes in For/Next loop... 
    
    fsize.q = FileSize (f$) 
    loops.q = fsize / chunk 
    remainder = fsize % chunk 

    If remainder 
        loops = loops + 1 ; Extra loop for remainder if less than 'chunk' bytes 
    EndIf 
    
    ; Get file handle... 
    
    file = CreateFile_ (@f$, #GENERIC_READ, 0, #Null, #OPEN_EXISTING, #FILE_ATTRIBUTE_NORMAL | #FILE_FLAG_SEQUENTIAL_SCAN, #Null) 
    
    If (file <> #INVALID_HANDLE_VALUE) 
    
        ; Create file map, defaulting to the whole file size... 
            
        mapped = CreateFileMapping_ (file, #Null, #PAGE_READONLY, 0, 0, #Null) 
    
        If mapped 

            ; Read 'chunk' bytes (65536 generally) however many times are needed... 
            
            For byte = 0 To loops - 1 
            
                ; Fill HiLo structure with current filemap offset... 
                
                PokeQ (@view, byte * chunk) 
                
                ; Hack for remainder on last loop if smaller than 'chunk' bytes... 
                
                mapbytes = chunk 
                If byte = loops - 1 
                    If remainder 
                        mapbytes = remainder 
                    EndIf 
                EndIf 
                
                ; Position 'chunk' sized view into filemap... 
                
                mapview = MapViewOfFile_ (mapped, #FILE_MAP_READ, view\lo, view\hi, mapbytes) 
                
                If mapview 

                    ; ------------------------------------------------------ 
                    ; Process 'mapbytes' bytes from mapview address here, eg... 
                    ; ------------------------------------------------------ 
                    
                            For offset = 0 To mapbytes - 1 
                                asc = PeekB (mapview + offset) 
                                ; Debug Chr (asc) 
                            Next 

                    ; ------------------------------------------------------ 
                    ; End of mapview processing 
                    ; ------------------------------------------------------ 

                    ; Free map view... 
                    
                    UnmapViewOfFile_ (mapview) 
                    
                EndIf 
                            
            Next 
    
            ; Close filemapping handle... 
            
            CloseHandle_ (mapped) 
            
        EndIf 
            
        ; Close file handle... 
            
        CloseHandle_ (file) 
        
    EndIf 
    
EndProcedure 

; ReadMappedFile ("test.txt")

For writing, you'd change the constants passed to CreateFile, CreateFileMapping, MapViewOfFile, etc.