Multithread/Multicore Software Rendering?

Started by kevglass, September 22, 2009, 11:13:12 AM

kevglass

Sorry if this has been asked before, I can't find the answer on the wiki or through searching the forums.

I'm using the software renderer at the moment (very cool btw!) and was wondering if it's possible to use more than one thread/core to generate the output image if available. At the moment I'm seeing 100% usage of a single CPU on multicore machines.

Thanks for any help (and for the wonderful jpct)

Kev

PS. Just experimenting at the moment, will write up on cokeandcode soon :)

EgonOlsen

Well, i attempted to add support for multiple cores/cpus a few times in the past. Always with a different approach (split per polygon, split per scan line, split the screen into different sections)...and none was really satisfying. With version 1.19, i tried another, much simpler, approach to add at least some support for it to the software renderer. You can try it by using this version: http://www.jpct.net/download/beta/jpctapi_119pre1.zip
If you set Config.useMultipleThreads to true, the software renderer will use two cores for a small part of the rendering. The benefits aren't very impressive right now (if any...)...it helps most when using oversampling. It's still experimental, so it may improve in the future.
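
For illustration only: the "split the screen into different sections" idea mentioned above can be sketched as dividing the frame's rows into one contiguous band per worker thread. This is a minimal sketch under that assumption, not jPCT's actual implementation; `BandSplitter` and `split` are hypothetical names.

```java
/** Hypothetical helper, not part of jPCT: splits a frame into horizontal bands. */
final class BandSplitter {

    /**
     * Returns one {startRow, endRowExclusive} pair per worker.
     * Rows that don't divide evenly are handed to the first bands, one each.
     */
    static int[][] split(int height, int workers) {
        if (height < 0 || workers < 1) {
            throw new IllegalArgumentException("height must be >= 0 and workers >= 1");
        }
        int[][] bands = new int[workers][2];
        int base = height / workers;   // rows every band gets
        int extra = height % workers;  // leftover rows, spread over the first bands
        int row = 0;
        for (int i = 0; i < workers; i++) {
            int rows = base + (i < extra ? 1 : 0);
            bands[i][0] = row;
            bands[i][1] = row + rows;
            row += rows;
        }
        return bands;
    }
}
```

With a 480-row frame and 4 workers, each band covers 120 rows. Equal-sized bands only balance well if the scene's cost is spread evenly over the screen.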

EgonOlsen

BTW: It may even hurt performance to enable it... :-\

kevglass

Thanks for that, performance is pretty reasonable for me at the moment, but I'm just trying out options :)

I thought the sectioning of the screen approach would be best. I'll certainly give it a try.

Not sure why I haven't tried JPCT before, it's really very light and easy. Very cool!

kev

EgonOlsen

Quote from: kevglass on September 22, 2009, 04:51:16 PM
I thought the sectioning of the screen approach would be best. I'll certainly give it a try.
Yes, i will try that myself again in the next few days and see what happens...

EgonOlsen

I did some tests and it works out pretty well so far. With a simple test case on a Core2 Quad @ 3.2GHz:

1 core:   85fps
2 cores: 140fps
3 cores: 180fps
4 cores: 210fps
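
Relative to the single-core figure, these numbers work out to speedups of roughly 1.65x, 2.12x and 2.47x, i.e. a per-core efficiency of about 82%, 71% and 62%. The arithmetic, as a small sketch (`Scaling` and its methods are hypothetical names used only for this calculation):

```java
import java.util.Locale;

/** Hypothetical helper for reading the fps figures above; not part of jPCT. */
final class Scaling {

    /** Speedup relative to the single-core frame rate. */
    static double speedup(double baseFps, double fps) {
        return fps / baseFps;
    }

    /** Fraction of ideal linear scaling actually achieved with the given core count. */
    static double efficiency(double baseFps, double fps, int cores) {
        return speedup(baseFps, fps) / cores;
    }

    public static void main(String[] args) {
        double[] fps = {85, 140, 180, 210}; // figures from the post above, 1 to 4 cores
        for (int i = 0; i < fps.length; i++) {
            int cores = i + 1;
            System.out.println(String.format(Locale.ROOT,
                    "%d core(s): speedup %.2fx, efficiency %.0f%%",
                    cores,
                    speedup(fps[0], fps[i]),
                    efficiency(fps[0], fps[i], cores) * 100));
        }
    }
}
```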

kevglass

Let me know when you have something to test, I've only got a little test racer atm, but I'd be really interested to see the results on my local box with multithreaded rendering.

http://www.cokeandcode.com/applets/jpct

Kev

EgonOlsen

Nice applet...and here you go: http://www.jpct.net/download/beta/jpctapi_119pre2.zip

It's all pretty experimental, but at least on my machine, the results are promising so far.

Enable it with:


Config.useMultipleThreads=true;


Set the number of cores to use (i can't do this automatically, because only Java 1.4 and higher can access this information):


Config.maxNumberOfCores=<int>;


Enable a debugging output in the rendered scene that shows the load balancing, i.e. which core processes which part:


Config.mtDebug=true;


And choose between static and dynamic (fun to watch when mtDebug=true... ;D) load balancing:


Config.loadBalancingStrategy=<0 or 1>;


Please let me know how it works for you...
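
How jPCT implements the two strategies isn't described in this thread, so the following is only a general sketch of the difference between static and dynamic load balancing over screen rows. All names (`LoadBalancingSketch`, `RowRenderer`, `renderStatic`, `renderDynamic`, the `chunk` parameter) are hypothetical and say nothing about jPCT's internals.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only; every name here is hypothetical and unrelated to jPCT's internals. */
final class LoadBalancingSketch {

    /** Stand-in for the real per-row rendering work. */
    interface RowRenderer {
        void render(int row);
    }

    /** Static: worker i always renders the same contiguous band. Returns rows rendered per worker. */
    static int[] renderStatic(int height, int workers, RowRenderer renderer) {
        if (workers < 1) throw new IllegalArgumentException("workers must be >= 1");
        int[] rowsDone = new int[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            final int start = (int) ((long) height * i / workers);
            final int end = (int) ((long) height * (i + 1) / workers);
            threads[i] = new Thread(() -> {
                for (int row = start; row < end; row++) {
                    renderer.render(row);
                    rowsDone[id]++; // each worker writes only its own slot
                }
            });
        }
        runAll(threads);
        return rowsDone;
    }

    /** Dynamic: workers keep claiming the next chunk of rows from a shared counter until none are left. */
    static int[] renderDynamic(int height, int workers, int chunk, RowRenderer renderer) {
        if (workers < 1 || chunk < 1) throw new IllegalArgumentException("workers and chunk must be >= 1");
        int[] rowsDone = new int[workers];
        AtomicInteger next = new AtomicInteger(0);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                int start;
                while ((start = next.getAndAdd(chunk)) < height) {
                    int end = Math.min(start + chunk, height);
                    for (int row = start; row < end; row++) {
                        renderer.render(row);
                        rowsDone[id]++;
                    }
                }
            });
        }
        runAll(threads);
        return rowsDone;
    }

    /** Starts all threads and waits for them to finish. */
    private static void runAll(Thread[] threads) {
        for (Thread t : threads) t.start();
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while rendering", e);
        }
    }
}
```

In the static variant the split is fixed up front, so a worker whose band contains the expensive part of the scene finishes last while the others idle. In the dynamic variant faster workers simply claim more chunks, at the cost of some contention on the shared counter.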

kevglass

Frickin awesome, I'll upload the new applet in a moment. I get an increase from 60fps to 100fps, about 66%? 2 cores here, so that's fantastic.

Great work!

Kev

kevglass

I've updated the applet with the new pre-release. I've got a bunch of folks on IRC testing it out, should get some more figures soon.

Kev

EgonOlsen

Cool! I've replaced that 1.19pre2 download with a newer build that should improve dynamic load balancing (in case you are actually using it) on Java5 and above.

kevglass

Ok, strange results here.

I'm using:

      Config.useMultipleThreads = true;
      Config.loadBalancingStrategy = 0;
      Config.maxNumberOfCores = Runtime.getRuntime().availableProcessors();

On my box (2 processors, 1 core each) it seems to utilise both.
On another box (1 processor, 2 cores) it seems to only use one thread.
On a third box (4 processors, 2 cores each) it seems to use each processor, but only at 40%ish.

I've had people check what Runtime.getRuntime().availableProcessors() returns, and each time it returns the number of hardware threads available (equal to the cores). So for the boxes above that's 2, 2 and 8.
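
A minimal sketch of deriving a worker count from that value, with a lower bound of 1 and an optional upper cap. `WorkerCount`, `choose` and the cap of 8 are hypothetical and only for illustration; the result is what would be assigned to Config.maxNumberOfCores in the snippet above.

```java
/** Hypothetical helper, not part of jPCT: picks how many render threads to request. */
final class WorkerCount {

    /** Clamps the reported processor count to the range [1, cap]. */
    static int choose(int available, int cap) {
        return Math.max(1, Math.min(available, cap));
    }

    public static void main(String[] args) {
        // availableProcessors() reports the processors available to the JVM.
        int available = Runtime.getRuntime().availableProcessors();
        int workers = choose(available, 8); // the cap of 8 is an arbitrary assumption
        System.out.println("available=" + available + ", workers=" + workers);
    }
}
```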

Should I be able to use more cores than processors?

Kev

kevglass

The guy (kappaOne) with 4 processors reports that each one is being utilised to about 30-35%.

Kev

EgonOlsen

It's actually pretty dumb...it spawns as many worker threads as you tell it in Config.maxNumberOfCores. You may as well configure it for 32 threads (as pointless as that may be on today's hardware) and it will use 32 threads for rendering. You can easily visualize this by using Config.mtDebug=true;
In addition to the rendering, it uses up to two threads for clearing the frame- and zbuffers. Using more than two for that doesn't make any sense, but it does mean that overall cpu usage on many-core machines drops during that stage, just as it does while the rendered image is displayed in the browser, because not all cores are fully utilized in those stages.
How the threads are scheduled to the cores/virtual cores is up to the OS...and you'll get some overhead from the thread management itself. On my quad core, your applet uses around 55% of all four cpus. That's a little less than my test cases do, but those don't include any game play or browser overhead.