Giving CPU power to other processes


Grey Alien(Posted 2007) [#1]
Hi, do you have any ideas how I can stop some BMax code (non-GUI) like the code below from hogging 50% (viewed in task manager) of the CPU time on my Hyperthreading PC? (assume it would be 100% on non-dual core/hyperthread CPUs)

I know it takes up max CPU power because I'm just looping and drawing so background tasks probably don't get a chance to process, but how could I allow that? In fact the problem would seem to be with Flip 1 as while it's doing VWait it would be nice if it let other processes get in there (as there must be a lot of idle time there), or maybe just let them get in there just generally, say when the loop is going on.

I know I could stick in Delay(1) but that's a bit lame as it reduces the amount of CPU time given to the game if the game really needs it in a busy period, and this only drops it to 40%ish in task manager.

I don't want to make a MaxGUI-based game.

I could not draw every frame if nothing has changed (but then if the window is moved, or another window is moved over it, it won't "repaint" unless I tell it to specifically) except that most games have moving things all the time. I could only do logic if the user inputs something (mouse or keys) but then games have other stuff that need to be checked/processed all the time like moving aliens, animations, time limits, menu mouse overs etc so that doesn't seem feasible.

However, if I could reduce the CPU load, I'm sure that Windows would live in harmony with the game more, and then the delta time (in my framework) wouldn't find itself having to catch up so much with big chunks of CPU time stolen by desperate processes, instead of just lots of little bits of shared CPU time.

Anyway here's some test code.

Strict
Graphics 800,600,0

While Not KeyHit(Key_ESCAPE)
	Local n = 0 
	For Local i = 1 To 10000
		n:+1
	Next
	Cls
	DrawRect 100,100,200,200
	Flip 1
Wend


I hope what I've written makes sense, anyone got any feedback? Many thanks in advance.


Warren(Posted 2007) [#2]
With the Delay(1) in there, it's my understanding that the game will more willingly give up the processor if something else needs it. If you're running a tight loop like that then, yeah, it's going to stay high while your app is the only thing needing the processor. Windows won't downgrade your app's slice if it doesn't need to.


Grey Alien(Posted 2007) [#3]
Had an idea, maybe after doing the logic I could "guess" how much time in ms is left before the next VSync and call Delay for that long, and then call Flip after the delay. This way Windows would get lots of CPU time each frame depending on how busy the game is. Might be a bit tricky to work out though...


GfK(Posted 2007) [#4]
You don't really need to guess it. You can time how long your loop takes, then add an appropriate delay at the end.

So, say you've allowed 34ms (30fps), and your loop takes 10ms, just stick a Delay(24) in there.

I'm not sure how well this'd work in practice, though.
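
A minimal sketch of that idea, assuming the 34ms budget from the example above and Flip 0 so that the flip itself never blocks. RemainingDelay and FRAME_BUDGET_MS are made-up names for illustration, not part of BlitzMax.

```blitzmax
Strict

Const FRAME_BUDGET_MS:Int = 34	' assumed budget, roughly 30fps as in the post above

' How long to sleep once this frame's work is done; never negative.
Function RemainingDelay:Int(budgetMs:Int, elapsedMs:Int)
	If elapsedMs >= budgetMs Then Return 0
	Return budgetMs - elapsedMs
End Function

Graphics 800,600,0

While Not KeyHit(KEY_ESCAPE)
	Local frameStart:Int = MilliSecs()
	Cls
	DrawRect 100,100,200,200
	Flip 0
	' Hand whatever is left of the frame budget back to the OS.
	Delay RemainingDelay(FRAME_BUDGET_MS, MilliSecs() - frameStart)
Wend
```

With a 10ms loop this sleeps for 24ms per frame. Because it uses Flip 0 it is not synced to the monitor, so tearing is still possible.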


Dreamora(Posted 2007) [#5]
It works quite well.
The con of this trick is that you will still get out of sync with vsync, so I guess the best thing to do is a Flip 1 once per second to "get in sync" once again.

PS: to find out the actual number of ms, you can do a 300 flip sequence or the like and measure the average time per frame while vsync is enabled. 300 is just an example, but I would flip at least as much as the best monitor can ...
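
A rough sketch of that measurement, assuming Graphics has already been opened and that Flip 1 really does wait for the vertical blank on the machine in question. AverageFrameMs and MeasureFlipMs are hypothetical helper names.

```blitzmax
Strict

' Average frame time from a total duration and a frame count.
Function AverageFrameMs:Float(totalMs:Int, frames:Int)
	Return Float(totalMs) / frames
End Function

' Flips `frames` times with vsync on and returns the average ms per flip.
' Assumes Graphics has already been called.
Function MeasureFlipMs:Float(frames:Int)
	Flip 1	' line up with a vertical blank before starting the clock
	Local start:Int = MilliSecs()
	For Local i:Int = 1 To frames
		Cls
		Flip 1
	Next
	Return AverageFrameMs(MilliSecs() - start, frames)
End Function

Graphics 800,600,0
Print "Average ms per frame: " + MeasureFlipMs(300)
```

On a 60Hz display, 300 flips take about 5 seconds, so a shorter run may be needed in a real game.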


Grey Alien(Posted 2007) [#6]
Yeah I'd thought about those methods. Problem is if I just do a delay for the remainder of frame time left, there is no telling how long the actual flip takes, so I'd be best delaying for a smaller number by one or two for safety so that no flips are ever missed by accident (Dreamora: is that what you meant by getting out of sync with vsync?).

Problem with a 300 frame (or even 180) test is that's 5 seconds long, so you can't just have your game sitting there with a black screen doing that or saying "loading" or whatever. So you think, OK I'll test the average on the title screen, but then what if it's animated and immediately needs to know the average frame length? Perhaps it would be worth testing for say 30 frames (roughly 0.5 seconds) on a black screen and then using that average and then refining it over time. I guess that it might have to "reset" the average every so often though, because random delays caused by Windows may gradually skew the average. Naturally you'd also have to track Alt+Tab and any other interference (e.g. window dragging) to make sure it doesn't get included in the average.
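
One way the "refine it over time" part might look, as a sketch only. It swaps in an exponential moving average (my choice, not something proposed in the thread) and ignores any sample more than twice the current average, on the assumption that such a sample is interference like Alt+Tab. TFrameAverage, SafeDelayMs and all three constants are invented for illustration.

```blitzmax
Strict

Const SMOOTHING:Float = 0.1		' weight of each new sample (assumed value)
Const OUTLIER_FACTOR:Float = 2.0	' samples above 2x the average are ignored (assumed value)
Const SAFETY_MS:Int = 2			' delay "one or two" ms less, for safety

Type TFrameAverage
	Field average:Float
	Field seeded:Int = False

	' Feed in one measured frame time in ms.
	Method AddSample(ms:Float)
		If Not seeded
			' The first sample seeds the average, so it should be a trusted one.
			average = ms
			seeded = True
		ElseIf ms <= average * OUTLIER_FACTOR
			average :+ (ms - average) * SMOOTHING
		EndIf
	End Method
End Type

' Delay to use before Flip 1: the frame time left over, minus the safety margin.
Function SafeDelayMs:Int(frameMs:Float, elapsedMs:Int)
	Local ms:Int = Int(frameMs) - elapsedMs - SAFETY_MS
	If ms < 0 Then Return 0
	Return ms
End Function

Local avg:TFrameAverage = New TFrameAverage
avg.AddSample(16)	' seed, e.g. from a short 30 frame test
avg.AddSample(100)	' ignored as interference
avg.AddSample(18)	' nudges the average up slightly
Print "Average: " + avg.average + "  delay after 5ms of work: " + SafeDelayMs(avg.average, 5)
```

A seed that is far too low would make every later sample look like an outlier, which is one reason the average might need the occasional reset.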


Grey Alien(Posted 2007) [#7]
Getting this right could be what's needed to make games smooth on a greater range of PCs.


Grey Alien(Posted 2007) [#8]
Just found out that if you use Flip 0 AND delay(1) you get loads of CPU time back (due to Delay being called very frequently), but not with Flip 1, which I was doing. But Flip 0 is undesirable due to vertical tearing, unfortunately.


Azathoth(Posted 2007) [#9]
You could set the priority of your program with the API, though it would still possibly use 50% if no other programs are busy.


Grey Alien(Posted 2007) [#10]
Azathoth: Done that. Well actually I've boosted it, not dropped it, in an effort to combat odd jerk problems people get using delta time, which are caused by Windows getting too much CPU time all in one go. This works except that Windows can get a bit flaky and stop redrawing etc!


Grey Alien(Posted 2007) [#11]
This MS article could be relevant as to why jerking can occur when using delta time:

http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q274323&

Furthermore someone else says:

"rdtsc is a per-CPU operation, so on multiprocessor systems you have to be careful that multiple calls to rdtsc are actually executing on the same CPU."

Could this mean that millisecs screws up on some systems then?


FlameDuck(Posted 2007) [#12]
Isn't this what PollSystem is for?


Grey Alien(Posted 2007) [#13]
KeyHit calls PollSystem. Adding a PollSystem line makes no difference to CPU usage.


FlameDuck(Posted 2007) [#14]
Ok. Btw:
Just found out that if you use Flip 0 AND delay(1) you get loads of CPU time back (due to Delay being called very frequently), but not with Flip 1, which I was doing.
This is because your drivers suck (ie. they "busy wait" until the top of the next frame, rather than "idle waiting").

There is no way you can change driver behavior, except either update them, or switch to a better product.


Grey Alien(Posted 2007) [#15]
they "busy wait" until the top of the next frame, rather than "idle waiting").
So it's not really much to do with the Delay statement but rather what my drivers do when VWaiting, i.e. give control back to Windows (idle) or hog it. I thought it was something like this. Are you saying that if we ran this test on other PCs, it would be different then? How do your drivers fare? If they idle wait, the code at the top should produce much lower CPU usage% without any calls to Delay.

There is no way you can change driver behavior, except either update them, or switch to a better product.
Wow so Radeon 9800XT sucks, oh well. Thanks for the info, thought it was something like that. Thing is lots of customers will be in the same boat so I have to design something that works for everyone.


FlameDuck(Posted 2007) [#16]
Are you saying that if we ran this test on other PCs, it would be different then?
Yes.

How do your drivers fare?
Not any better than yours, I'm afraid.

Wow so Radeon 9800XT sucks, oh well.
Just the drivers. Actually it may be a limitation of the Windows driver model, but not having written any drivers myself, I wouldn't know for sure.

Thing is lots of customers will be in the same boat so I have to design something that works for everyone.
Yup. Wintel development is a drag. Maybe Vista will address this, even though that won't immediately solve your problem for another 2 years.


AlexO(Posted 2007) [#17]
Timers are great. Sadly, with Flip 1 I couldn't come up with any solution that didn't hog my PC. I'm running a Radeon 9800XT also. Using timers as below with Flip 0 brings CPU down to like 0-2% on a 3.2GHz HT-P4. Could even separate out logic and drawing with 2 timers, each on different frequencies.

Strict
Graphics 800,600,0

Local time:TTimer = CreateTimer(30)
Local lastTickCount:Int = 0

While Not KeyHit(Key_ESCAPE)
	WaitEvent()
	Local ticks:Int = TimerTicks(time)
	
	
	If ticks > lastTickCount Then
		lastTickCount = ticks
		Local n = 0 
		For Local i = 1 To 10000
			n:+1
		Next
		Cls
		DrawRect 100,100,200,200
		Flip 0
	End If
	
Wend
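
The two-timer variation could look something like this. It is a sketch only: the 60Hz and 30Hz rates, the StepLogic helper and the moving rectangle are made up for illustration, and it relies on each timer tick arriving as an EVENT_TIMERTICK event whose source is the timer that fired.

```blitzmax
Strict

' Hypothetical logic step: move the rectangle 2 pixels, wrapping at the window edge.
Function StepLogic:Int(x:Int)
	Return (x + 2) Mod 800
End Function

Graphics 800,600,0

Local logicTimer:TTimer = CreateTimer(60)	' logic rate (assumed)
Local drawTimer:TTimer = CreateTimer(30)	' drawing rate (assumed)
Local x:Int = 0

While Not KeyHit(KEY_ESCAPE)
	WaitEvent()	' sleeps until the next event, so the CPU is free in between
	If EventID() = EVENT_TIMERTICK
		If EventSource() = logicTimer
			x = StepLogic(x)
		ElseIf EventSource() = drawTimer
			Cls
			DrawRect x,100,200,200
			Flip 0
		EndIf
	EndIf
Wend
```

As with the single timer version, this uses Flip 0, so it does nothing about tearing.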



SculptureOfSoul(Posted 2007) [#18]

This MS article could be relevant as to why jerking can occur when using delta time:

support.microsoft.com/default.aspx?scid=KB;EN-US;Q274323&

Furthermore someone else says:

"rdtsc is a per-CPU operation, so on multiprocessor systems you have to be careful that multiple calls to rdtsc are actually executing on the same CPU."

Could this mean that millisecs screws up on some systems then?


If I recall correctly, Blitz doesn't call QueryPerformanceCounter & QueryPerformanceFrequency to determine the elapsed time, so this shouldn't be an issue.


Azathoth(Posted 2007) [#19]
If I recall correctly, Blitz doesn't call QueryPerformanceCounter & QueryPerformanceFrequency to determine the elapsed time, so this shouldn't be an issue.

Microsoft says to use QueryPerformanceCounter & QueryPerformanceFrequency instead of RDTSC.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directx9_c/Game_Timing_and_Multicore_Processors.asp


Dreamora(Posted 2007) [#20]
You don't have problems with that in BM. BM does simply not feature multi core so the potential problems simply can not arise. (Nor can you fake your way around that; the GC will simply break if you think of being the cool one with one of those "multi threading" modules floating around.)

But the main pro of doing it that way is that it avoids the heavy drawback of the "old" method they mention: the old method is simply not capable of correctly handling modern CPUs that use dynamic stepping. (I'm only on notebooks, so this is a serious aspect for me, not a funny side feature. I hate crap games that suck 100% CPU and stuff like that but still run normally when the GPU and CPU are both fully stepped down.)


Grey Alien(Posted 2007) [#21]
Hmm, interesting info, but still no closer to a solution for Flip 1 :-(

When I coded in C++ in DOS years ago I looked at CPU ticks not millisecs. I wonder if BMax was able to query other timers if they might be more accurate than millisecs?


Dreamora(Posted 2007) [#22]
Through API, most likely.
Problem is that their "precision" is a matter of debate ... they depend on stuff that is not that precise after all, as it is not static. (You might know of the P4 HT / AMD X2 timing bugs when no stepping drivers are installed, which are caused by these so-called high precision timers.)


Grey Alien(Posted 2007) [#23]
Ah I see, that sounds a bit rubbish.


FlameDuck(Posted 2007) [#24]
You don't have problems with that in BM. BM does simply not feature multi core so the potential problems simply can not arise.
Rubbish. While yes, BlitzMAX is a single threaded process, and thus not prone to concurrency problems, unless BlitzMAX specifically sets the processor affinity to only run on a single processor/core, the process scheduler's load balancing algorithm may choose to execute your program on another core/CPU in its next time slice, depending on how much load the other core is under. Since the graphics drivers are busy-waiting (that is 100% load) it's highly likely that your process is executed on separate cores in each time slice.

Hmm, interesting info, but still no closer to a solution for Flip 1 :-(
Use Linux?


Grey Alien(Posted 2007) [#25]
I'm going back to DOS, or maybe get my A1200 out again.


Dreamora(Posted 2007) [#26]
Yeah, on Linux you wouldn't have such problems. Without hardware 3D support at a useful speed you wouldn't have to think about "waste waited" CPU cycles ;)


Grey Alien(Posted 2007) [#27]
Plus I'd get about 1 sale.


DStastny(Posted 2007) [#28]
I think you are worrying about nothing in regards to the CPU utilization. The fact that your render loop maxes out the CPU utilization in Windows is not a bad thing.

Basically the way the loop is set up is to yield to the OS when you call PollSystem (KeyHit in your example case). When you call PollSystem, under the covers it yields to the message queue by calling PeekMessage then GetMessage. Calling those Windows APIs yields your process to the Task Scheduler and allows smooth access to the GUI. If you have any other processes running they will get all the CPU they need. If you don't yield, the OS will preempt your process based upon priority and yield it for you, although you may get sluggish GUI behavior.

Test this out with a console program that infinite loops vs a Windows app that infinite loops. You'll notice no impact on the GUI with the console app, but the GUI app will mung up the GUI interface.

There is nothing wrong with this; it is normal behavior. The fact that your utilization is 100% for a multimedia style application is good. Alternatively you can use the event hook style and set up a timer that fires x number of times per second. Or, if you wish, sleep your app with "Delay(1)". Although I wouldn't do that for an arcade style game, a match 3 type that you're famous for should be fine.

Most applications spend most of their time waiting for messages.

Doug Stastny


Grey Alien(Posted 2007) [#29]
Budman: Yeah thanks for the feedback. Basically someone in another thread posted about one of my game demos taking a lot of CPU time, which initially I thought was fine too so that it didn't slow down or whatever, but then I thought about people who get "jerks" occasionally and wondered if the game wasn't yielding enough to Windows and thus it was "stealing" a big chunk of time at an inconvenient moment instead of lots of little chunks. Maybe it's unavoidable for a BMax game to be interrupted by windows and other Processes, so I'll just have to look into code to smooth out the jerks. As for GUI apps, yeah I made quite a few in Delphi (as have you) so I understand that they sit there idling until something happens, but you can get them stuck in a loop and they stop processing input and redrawing as you say.


ImaginaryHuman(Posted 2007) [#30]
When you run your app you are not disabling multitasking. It is totally inevitable therefore that the o/s is going to interrupt your app and take some time to do something on a periodic basis. Unlike the old Amiga days where you could actually switch off multitasking altogether so you can hog the whole system, these days you can't really do that so you have to be more cooperative.


FlameDuck(Posted 2007) [#31]
get my A1200 out again.
Now there's an idea.

Maybe it's unavoidable for a BMax game to be interrupted by windows and other Processes, so I'll just have to look into code to smooth out the jerks.
No, it's unavoidable for any process running on a pre-emptive multitasking operating system. That's what pre-emptive means.


Grey Alien(Posted 2007) [#32]
It's just that some people get pretty bad jerking on scrolling esp. in Max but less so in some other games apparently...


ImaginaryHuman(Posted 2007) [#33]
Isn't Delay(1) supposed to take care of it?


Grey Alien(Posted 2007) [#34]
I'll make a test app I think using various different methods and see which has the best performance.


Dreamora(Posted 2007) [#35]
If you need non-jerkiness, players just have to understand that the description "DirectX Fullscreen Exclusive" is not just some PR gag ... either they want their game to get the exclusive focus of the OS, with the resulting high performance, or they want other stuff running at the same time and have to live with the consequences of that step, which depend on what the other stuff is (most likely a problem if it uses 3D even when not active, like having a movie-capable media player running and the like).

After all, fullscreen ex isn't just for fun; windowed, on the other hand, was never meant for "full realtime" stuff ... So the easiest thing is to restrict windowed usage to the point where a max of X flips is done per second (60? Screen Hz / 2, or in the worst case even / 3) so the app can't get jerky. This could be handled by some simple checks, for example whether the average runtime of the main loop shows serious differences etc.

Just as a small example: Having iTunes running at the same time will nearly guarantee ANY game to start getting jerky (even with dual core it is somehow able to impact the whole system. With some other apps running at the same time it can even hardlock the OS)


Grey Alien(Posted 2007) [#36]
Yeah the problem is "normal" users who may have stuff running and not really realise it (or spyware etc), and iTunes is a good example. Then they run my game and think, hmm this is a bit crappy and don't buy. Maybe that's why puzzle games do OK as they don't need constant smooth anim like a scrolling game does. Dropping the framerate is a good idea so jerks are less obvious, but then it just looks crapper anyway, and is a shame for people whose PCs can run it no problems (like mine). What to do...? Sigh.


Warren(Posted 2007) [#37]
Just as a small example: Having iTunes running at the same time will nearly guarantee ANY game to start getting jerky (even with dual core it is somehow able to impact the whole system. With some other apps running at the same time it can even hardlock the OS)

I can second the iTunes complaint. Even when it isn't playing a song, it robs my projects of frame rate constantly.


Grey Alien(Posted 2007) [#38]
Worth knowing. Right now my system idle is 99% so no wonder stuff runs smoothly on my PC.