Generators / yield


Yasha(Posted 2014) [#1]
More language extensions... this time, a Generator class.

generator.bmx:


generator.S (compile with 'gcc -m32 -c generator.S' to get generator.o):



And a simple (worthless) example similar to the Python one on the wiki:


Uncomment the Print lines to see weird control flow in action.

It struck me after beginning this that for all practical purposes, an ObjectEnumerator is a generator anyway, it just doesn't make the Yield command explicit... well, this offers an explicit Yield command that can express slightly more unreadable logic.

The example should be pretty straightforward, but anyway - to implement your own generator function, subclass TGenerator and give it a Run() method; return objects via Yield. The Run() method acts as the body of the generator function. Do not invoke Run() directly - it will probably crash your program if you do.

Assuming you're using the generator in a For-EachIn loop, the generator will continue to return objects and resume execution after Yield until either the loop kills it (with Exit, as usual), or the generator calls Done (which will tell the loop that there are no more values). If you're not using a loop (?), you can resume the generator (or start it for the first time) with Resume. The generator can also have its position reset with Reset, causing it to forget where it Yielded from and roll from the top on the next run. (Reset does not affect anything else, e.g. object fields.)
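
For readers who know Python better than BlitzMax, the loop behaviour described above maps roughly onto Python's built-in generators. This is only an analogy, not the TGenerator code from this post: `count_from` is a made-up name, `return` plays the role of Done, `next()` plays the role of Resume, and Python has no direct Reset, so the sketch simply creates a fresh generator instead.

```python
def count_from(n, limit):
    """Yield n, n+1, ... up to and including limit, then finish."""
    while n <= limit:
        yield n      # pause here; execution resumes on the next request
        n += 1
    return           # roughly analogous to Done: nothing left to hand out

# The consumer loop ends early, like Exit inside a For-EachIn loop.
seen = []
for value in count_from(1, 100):
    if value > 3:
        break
    seen.append(value)

# Without a loop: step the generator by hand (roughly analogous to Resume).
gen = count_from(5, 6)
first = next(gen)
second = next(gen)
finished = next(gen, None)   # the generator has ended, so the default comes back

# Closest thing to Reset in Python: build a new generator that starts from the top.
gen = count_from(5, 6)
restarted = next(gen)
```

As with Reset in the post, restarting only affects where execution picks up; any state kept elsewhere is untouched.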

The Count.From function and the parameter field .n in the example are not integral to generators, and you can use this sort of thing, or not, as you like.


Minimally tested on OSX only so far.


Who was John Galt?(Posted 2014) [#2]
What's the point of this? Serious question, not a troll. What sort of thing is it used for?


Yasha(Posted 2014) [#3]
Two parts to answering that.

Firstly, the technical functionality that this adds is a "Yield" keyword (I've implemented it as a method, but you should think of it as a keyword). "Yield" (in the context of generators) is like "Return" except it remembers where you Yielded from, and the function can be resumed immediately afterwards. (Coroutine yield is more complicated... and not relevant here.) So if you yield in the middle of a loop, you can resume the function and continue the rest of the loop and iterate again. (If I got this right, it should record the state of all the local variables too, minimizing the need for instance fields as locals.)
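
To make "remembers where you Yielded from" concrete, here is a small sketch using Python's built-in `yield` as a stand-in (the function name `running_total` is invented for illustration, and this says nothing about how TGenerator is implemented). The local variable `total` and the loop position both survive each pause:

```python
def running_total(values):
    total = 0                 # a plain local, not an object field
    for v in values:
        total += v
        yield total           # suspend in the middle of the loop

gen = running_total([10, 20, 30])
a = next(gen)   # runs up to the first yield
b = next(gen)   # resumes inside the loop; total is still remembered
c = next(gen)   # and again for the third element
```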

Secondly, the concept: what generators are usually for is supplying objects to a consumer - such as a loop body, shown here. The conceptual difference between an enumerator and a generator is that an enumerator, as we usually see in BlitzMax, takes objects from an existing collection; a generator is normally lazy, that is, it creates the stream of objects on-demand (as shown in the example with an infinite list of numbers). A generator is thus more general than an enumerator: it can handle infinite sequences (which obviously can't be created in advance), or it can specialize each instance in response to something that happens in between (cooperative tasks), such as perhaps the loop body providing feedback ("no! you're doin' it wrong!").
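
Both ideas can be sketched in Python, again only as an analogy. `naturals` and `guesser` are hypothetical names, and the feedback channel uses Python's `send()`, which is Python's own mechanism and not something the generator class in this thread is known to offer.

```python
from itertools import islice

def naturals():
    """An infinite, lazy sequence: values exist only when asked for."""
    n = 0
    while True:
        yield n
        n += 1

# Take just the first five values; the rest are never computed.
first_five = list(islice(naturals(), 5))

def guesser(low, high):
    """Produce guesses, adjusting to feedback sent back by the consumer."""
    while low <= high:
        guess = (low + high) // 2
        feedback = yield guess        # pause, and receive the consumer's reply
        if feedback == "higher":
            low = guess + 1
        elif feedback == "lower":
            high = guess - 1
        else:
            return                    # any other reply ends the stream

secret = 37
gen = guesser(1, 100)
guess = next(gen)                     # start the generator, get the first guess
guesses = [guess]
while guess != secret:
    hint = "higher" if guess < secret else "lower"
    guess = gen.send(hint)            # the "you're doin' it wrong" channel
    guesses.append(guess)
```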

The Yield command makes it easier to express the production of objects in terms of a stream, instead of as a sequence: you can write your object production as a single loop, and it looks like a production loop is passing things "up" to a function (when in fact the loop is being paused and the objects are being passed down to the consumer). That way you can have separate but interleaved producer/consumer loops. (You could envisage this in terms of two threads, except each one has to pause the other while it does its thing then hand back control, so it only needs one OS thread.)
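
The interleaving is easy to see by logging both loops. This is a Python sketch with invented names (`producer`, `log`), shown only to illustrate the control flow; it is not the BlitzMax example from the first post.

```python
log = []

def producer(items):
    # Written as one ordinary loop, even though it gets paused at every yield.
    for item in items:
        log.append(f"produce {item}")
        yield item
        log.append(f"resume after {item}")

# The consumer is a second ordinary loop; the two take turns on one thread.
for item in producer(["a", "b"]):
    log.append(f"consume {item}")
```

Afterwards `log` alternates between producer and consumer entries, which is the same kind of weird control flow the Print lines in the original example reveal.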

Mechanically speaking there's nothing to stop you using ObjectEnumerator to create lazy streams, as I mentioned in the OP; but it's a little less expressive as the enumerator is always locked into looking like a higher level that has to "return" complete objects. With generators you're able to write object production as a loop, which may come closer to what you actually want to express by making it look like a production stream.


If that didn't make sense (which it probably didn't), the best I can suggest is to look at the wiki articles for Generator and Coroutine, and maybe find some Python tutorials on its version of this.


Who was John Galt?(Posted 2014) [#4]
Thanks Yasha for the detailed explanation. I'm leaving it to sink in for a bit, then I'll be taking a look at those links.


Yasha(Posted 2014) [#5]
I'm ashamed to admit that there was a really stupid mistake in this, which I think has now been fixed - it didn't actually save the state of all local variables (which I didn't notice because the example used a field - now it uses a local to demonstrate the ...uh, point). Worse, it corrupted others in the rest of the program! (Basically I had forgotten entirely about the existence of callee-save registers.)

Please copy the code again if you grabbed the initial version.


Derron(Posted 2014) [#6]
I had to wait for the explanation too - so thanks @ John for asking.

@do not run "run()" directly ... what about naming such functions "_Run", so people who know this convention understand what it means even though BlitzMax has no private functionality?


@functionality
I can think of things like the generator calling "BeforeRun" and then "Run", so things can be done hidden/automagically.
It may save code when extending objects and overriding functions/methods, which otherwise need "super.methodname()" inside to call the original function.

Other ideas how this little pearl could be used?


bye
Ron


Yasha(Posted 2014) [#7]
'BeforeRun'? I don't follow.... This doesn't provide any obvious means to handle "silent extension" (AKA "advice"), at least not that I can see. That would be a useful feature but I don't see the connection to generators.

I was going to try to make an Aspects module next, though, if that interests you.


Derron(Posted 2014) [#8]
"BeforeRun" ...

Something in the likes of:

For Local obj:TMyObjects = EachIn generator.getEach()
    obj.save()
Next

The idea is that getEach calls obj.BeforeSave() before it returns each object to the For loop, so "BeforeSave" doesn't have to be called inside TMyObjects.Save() (which could then be extended multiple times without calling "super.save()" in the overridden save() function).
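
A rough Python sketch of that idea, with made-up names throughout (`Document`, `get_each`, `before_save`); it only assumes that the producer can run some code on each object before handing it to the loop:

```python
class Document:
    def __init__(self, name):
        self.name = name
        self.events = []          # records the order in which methods ran

    def before_save(self):
        self.events.append("before_save")

    def save(self):
        # Note: no call to before_save() here, and subclasses overriding
        # save() would not need to remember it either.
        self.events.append("save")

def get_each(objects):
    """Hand out each object, running the hook just before it is yielded."""
    for obj in objects:
        obj.before_save()
        yield obj

docs = [Document("a"), Document("b")]
for doc in get_each(docs):
    doc.save()
```

Each object ends up with the hook recorded before the save, without the loop body or `save()` mentioning it.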


Now I feel kind of dumb :D

bye
Ron