Terminology of Object3D

Started by jpro, December 09, 2011, 03:59:40 PM


jpro

I have a prototype I've been working on, which I originally started in OpenGL and am trying to port over to JPCT. The majority of this transfer was quite painless - JPCT is way easier to use than base OpenGL calls, and I can only see the benefits getting larger as the project progresses. However, I've not yet been able to get the same performance as the OpenGL solution. This morning I was thinking that a big part of the problem is I don't exactly know what some of these Object3D functions do behind the scenes.

From a high level, I view an Object3D as a set of triangles that, when added to a World, will be rendered. But that's perhaps not even accurate, and there are also several functions that need to be called whose descriptions are in a lot of ways very black-box.

Here are the functions I'm curious about:

Quote
public void build()
Initializes some basic object properties that are needed for almost all further processing. build() should be called if the object is "ready to render" (loaded, Textures assigned, placed, rendering modes set...).
Adding new triangles to an object after calling build() may work but doesn't have to. If you want to do this, call unbuild() before.

This description tells me an Object3D is initially in a state that is not renderable. At a high level, what does build() do to make it renderable?

Quote
public void compile()
Compiles an Object3D. Don't confuse this with build(), which does something completely different. A compiled object can be rendered faster when using the hardware renderer but has some shortcomings...

Based on the Javadoc I'm assuming this sets up the Object3D in a display list or VBO (depending on settings).

EgonOlsen

The desktop version can render Object3Ds without build() being called, but it's not advised to do this. build() basically does this:


  • calculate the bounding box
  • calculate the vertex normals
  • calculate center and rotation pivot
  • ...and some additional stuff that only the software renderer needs

If you don't call build(), vertex lighting won't work and gross culling isn't performed.
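
To make the list above a little more concrete, here is a minimal sketch of the kind of per-mesh data it describes (bounding box and center). MeshBounds is a hypothetical helper, not jPCT code, and taking the center as the plain average of the vertices is an assumption made for illustration only.

```java
// Hypothetical helper, not part of jPCT: computes an axis-aligned bounding
// box and a center for a mesh given as a flat x, y, z vertex array.
final class MeshBounds {
    final float[] min = new float[3];
    final float[] max = new float[3];
    final float[] center = new float[3]; // assumed: average of all vertices

    MeshBounds(float[] vertices) {
        if (vertices.length == 0 || vertices.length % 3 != 0) {
            throw new IllegalArgumentException("expected a non-empty xyz array");
        }
        for (int axis = 0; axis < 3; axis++) {
            min[axis] = Float.POSITIVE_INFINITY;
            max[axis] = Float.NEGATIVE_INFINITY;
        }
        // Single pass: track extremes and accumulate the sum per axis.
        for (int i = 0; i < vertices.length; i += 3) {
            for (int axis = 0; axis < 3; axis++) {
                float v = vertices[i + axis];
                min[axis] = Math.min(min[axis], v);
                max[axis] = Math.max(max[axis], v);
                center[axis] += v;
            }
        }
        int count = vertices.length / 3;
        for (int axis = 0; axis < 3; axis++) {
            center[axis] /= count;
        }
    }

    // Coarse rejection test: true if the whole box lies beyond the given x limit.
    boolean entirelyBeyondX(float limit) {
        return min[0] > limit;
    }
}
```

A box like this is what allows a whole object to be rejected with one cheap test instead of testing every triangle, which is presumably why skipping build() disables gross culling.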


compile() basically does what you assume: it processes an Object3D so that the hardware can render it more efficiently, i.e. it optimizes the object structure, groups object parts and creates a vertex array/display list/vertex buffer for each group. It's perfectly fine not to compile an Object3D, but you are then using a completely different render pipeline, which is more flexible but much slower than the more hardware-oriented one.

What exactly are you rendering and how much slower is "slower"? It will be slower simply because there's more overhead compared to just pushing render calls to the GPU, but it actually should pay off the more complex the scenes get. On a multi core setup, you can also enable multi threading: http://www.jpct.net/wiki/index.php/Multithreading (Just keep in mind to use an IPaintListener to count the fps then or you'll only count the processing but not the rendering).
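
As a rough sketch of the fps counting mentioned above: with multithreading enabled, the counter has to be advanced from the rendering side rather than from the main loop. FrameCounter below is a hypothetical plain-Java class; the assumption is that frameFinished() gets called from the paint listener's end-of-frame callback, and that wiring is not shown here.

```java
// Hypothetical helper, not part of jPCT: counts completed frames per
// one-second window. Intended to be driven from the render thread.
final class FrameCounter {
    private long windowStart;
    private int frames;
    private int lastFps = -1; // -1 until the first full window has passed

    FrameCounter(long startMillis) {
        this.windowStart = startMillis;
    }

    // Call once per painted frame. Returns the frame count of the most
    // recently completed window, or -1 if no window has completed yet.
    synchronized int frameFinished(long nowMillis) {
        frames++;
        if (nowMillis - windowStart >= 1000) {
            lastFps = frames;
            frames = 0;
            windowStart = nowMillis;
        }
        return lastFps;
    }
}
```

The method is synchronized because the value may be read from a different thread than the one doing the painting.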

jpro

I'm procedurally generating a world with a very blocky shape (think Minecraft or Infiniminer). The performance issues I'm pretty sure come from too many Object3Ds in the scene. Even with a small chunk of terrain like a single 16x16x16 chunk, there are 1536 tiles to render, assuming you cut out the interior faces. As more chunks get added, say even a tiny playable area like 6 chunks, it potentially gets into the couple thousand range of tiles, each its own Object3D. What makes sense from a design perspective (each face represented by an individual plane) doesn't seem to work when rendered in large numbers. If this sounds familiar it's because I made a post on this same issue almost a year ago.
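
The 1536 figure is 6 sides of 16 x 16 faces each. A small sketch that counts exposed faces for an arbitrary chunk makes the arithmetic checkable; ChunkFaces and its method names are made up for this illustration, and cells outside the array are assumed to be empty.

```java
// Hypothetical helper: counts block faces that border an empty cell.
final class ChunkFaces {
    // The six axis-aligned neighbour directions.
    private static final int[][] DIRS = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}
    };

    // solid[x][y][z] is true where a block exists.
    static int countExposedFaces(boolean[][][] solid) {
        int faces = 0;
        for (int x = 0; x < solid.length; x++) {
            for (int y = 0; y < solid[0].length; y++) {
                for (int z = 0; z < solid[0][0].length; z++) {
                    if (!solid[x][y][z]) {
                        continue;
                    }
                    for (int[] d : DIRS) {
                        if (!isSolid(solid, x + d[0], y + d[1], z + d[2])) {
                            faces++;
                        }
                    }
                }
            }
        }
        return faces;
    }

    // Cells outside the chunk are treated as empty (an assumption).
    private static boolean isSolid(boolean[][][] s, int x, int y, int z) {
        return x >= 0 && x < s.length
            && y >= 0 && y < s[0].length
            && z >= 0 && z < s[0][0].length
            && s[x][y][z];
    }

    // Builds a completely filled size^3 chunk.
    static boolean[][][] filledChunk(int size) {
        boolean[][][] solid = new boolean[size][size][size];
        for (boolean[][] plane : solid) {
            for (boolean[] row : plane) {
                java.util.Arrays.fill(row, true);
            }
        }
        return solid;
    }
}
```

For a completely filled 16x16x16 chunk, countExposedFaces returns 1536; real terrain with caves or an uneven surface will differ.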

Correct me if I'm wrong but my thinking is that tiles in a chunk need to be merged and recompiled so instead of 1536 Object3Ds you have one. Rendering 6 chunks isn't 5000 Object3Ds but 6. When I look at some of the other demos, which all render with great framerates, the main difference I notice is that while objects might have a lot of triangles within them, they are still just one contained Object3D. There clearly seems to be a disadvantage to rendering too many discrete objects in a scene. And since you can then compile the Object3D, you've got the entire chunk, all ~1000 faces of it, on a single display list (or VBO).
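
A minimal sketch of that merging step, independent of any engine: collect the exposed faces of a chunk into one flat triangle buffer, which could then back a single object. ChunkMesher is a hypothetical name, blocks are assumed to be unit cubes, and the winding order is not tuned for any particular culling convention.

```java
// Hypothetical helper: merges all exposed faces of a chunk into one buffer.
final class ChunkMesher {
    // Neighbour offset per direction: +X, -X, +Y, -Y, +Z, -Z.
    private static final int[][] NEIGHBOUR = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}
    };
    // The four corners of each face on a unit cube, in the same order.
    private static final int[][][] CORNERS = {
        {{1, 0, 0}, {1, 1, 0}, {1, 1, 1}, {1, 0, 1}},
        {{0, 0, 0}, {0, 0, 1}, {0, 1, 1}, {0, 1, 0}},
        {{0, 1, 0}, {0, 1, 1}, {1, 1, 1}, {1, 1, 0}},
        {{0, 0, 0}, {1, 0, 0}, {1, 0, 1}, {0, 0, 1}},
        {{0, 0, 1}, {1, 0, 1}, {1, 1, 1}, {0, 1, 1}},
        {{0, 0, 0}, {0, 1, 0}, {1, 1, 0}, {1, 0, 0}}
    };
    // Two triangles per quad, as indices into the four corners.
    private static final int[] QUAD = {0, 1, 2, 0, 2, 3};

    // A face is exposed if the neighbouring cell is empty or outside the chunk.
    private static boolean exposed(boolean[][][] solid, int x, int y, int z, int dir) {
        int nx = x + NEIGHBOUR[dir][0];
        int ny = y + NEIGHBOUR[dir][1];
        int nz = z + NEIGHBOUR[dir][2];
        boolean inside = nx >= 0 && nx < solid.length
            && ny >= 0 && ny < solid[0].length
            && nz >= 0 && nz < solid[0][0].length;
        return !(inside && solid[nx][ny][nz]);
    }

    // Returns x, y, z triples: 18 floats (2 triangles) per exposed face.
    static float[] buildTriangles(boolean[][][] solid) {
        int cells = solid.length * solid[0].length * solid[0][0].length;
        float[] out = new float[cells * 6 * 18]; // worst case, trimmed below
        int p = 0;
        for (int x = 0; x < solid.length; x++) {
            for (int y = 0; y < solid[0].length; y++) {
                for (int z = 0; z < solid[0][0].length; z++) {
                    if (!solid[x][y][z]) {
                        continue;
                    }
                    for (int dir = 0; dir < 6; dir++) {
                        if (!exposed(solid, x, y, z, dir)) {
                            continue;
                        }
                        for (int corner : QUAD) {
                            out[p++] = x + CORNERS[dir][corner][0];
                            out[p++] = y + CORNERS[dir][corner][1];
                            out[p++] = z + CORNERS[dir][corner][2];
                        }
                    }
                }
            }
        }
        return java.util.Arrays.copyOf(out, p);
    }
}
```

A filled 16x16x16 chunk yields 1536 faces, i.e. 3072 triangles in a single buffer. Whether rebuilding such a buffer on every block change is cheap enough is exactly the open question in this thread.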

Edit: I feel like an idiot when I post on this subject. I spend a lot of time writing web application code and then I try to render a simple graphics scene and I'm really out of my element. Perhaps that's what draws me to this time and time again.

EgonOlsen

Yes, rendering them all as single objects isn't very efficient...but neither is recreating/recompiling them all the time. This is quite a special case to render and it isn't covered very well by what is there. raft once had a similar problem with a kind of platform game for Android. I suggested a hack that turned out to work pretty well, but it still feels hacky...the idea is to create one large Object3D (or maybe a set of Object3Ds for different kinds of blocks) with a fixed number of entities (in this case blocks). Then compile these as animated objects and use an IVertexController to shift away the blocks that you don't need (by giving them some absurdly high coordinates, for example) and shift the blocks that you need to the positions where they belong...I'm not sure if it's clear what I mean, and when I suggested this to raft, it was half meant to be a joke...and it still feels a little bit like one even if it worked fine.
I'm not sure if writing a special engine optimized for this special case wouldn't be a better solution. However, I can try to set up a simple example that uses this idea to see how it works out if you are interested...
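
Stripped of any engine specifics, the fixed-pool idea could look roughly like the sketch below. BlockPool, its method names and the FAR_AWAY value are all hypothetical; in jPCT the apply() step would live inside a vertex controller.

```java
// Hypothetical sketch: a fixed number of block slots in one merged vertex
// array. Unused slots are parked far away instead of being removed.
final class BlockPool {
    static final float FAR_AWAY = 1.0e6f; // assumed "absurdly high" coordinate

    private final float[] template; // x, y, z of one block's vertices, local space
    private final float[] offsets;  // x, y, z position per slot
    private final boolean[] used;   // whether a slot currently shows a block
    final float[] vertices;         // merged output for all slots

    BlockPool(float[] template, int slots) {
        this.template = template.clone();
        this.offsets = new float[slots * 3];
        this.used = new boolean[slots];
        this.vertices = new float[slots * template.length];
        apply(); // all slots start hidden
    }

    void place(int slot, float x, float y, float z) {
        used[slot] = true;
        offsets[slot * 3] = x;
        offsets[slot * 3 + 1] = y;
        offsets[slot * 3 + 2] = z;
    }

    void hide(int slot) {
        used[slot] = false;
    }

    // Rewrites every vertex position. The vertex count never changes, so the
    // merged object keeps its structure and only its data is updated.
    void apply() {
        int n = template.length;
        for (int s = 0; s < used.length; s++) {
            for (int i = 0; i < n; i += 3) {
                int o = s * n + i;
                if (used[s]) {
                    vertices[o] = template[i] + offsets[s * 3];
                    vertices[o + 1] = template[i + 1] + offsets[s * 3 + 1];
                    vertices[o + 2] = template[i + 2] + offsets[s * 3 + 2];
                } else {
                    vertices[o] = FAR_AWAY;
                    vertices[o + 1] = FAR_AWAY;
                    vertices[o + 2] = FAR_AWAY;
                }
            }
        }
    }
}
```

Hiding a block collapses all of its vertices onto one distant point, so its triangles become degenerate and end up far outside the view.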

jpro

Chunks would need to be redrawn and recompiled each time a chunk loads next to them or a block in the chunk changes. Is it expensive to recreate an object with ~3000 triangles? Only the chunk the player is in would ever need to be recreated in this manner.

I have to think a similar thing needs to be accomplished whether the rendering is accomplished through JPCT or just OpenGL. A chunk is made up of display lists at some point, whether that is:

  • each chunk is 6 display lists, one for each tile side
  • each chunk tile is its own display list
  • every chunk tile is in a single display list

There must be a good middle ground that makes sense.

---

Thinking on the idea you posted, it does seem very hackish. Basically what that says to me is that it's more performant to leave vertices in the world vs. recompiling the object that contains them. If so, would it be less hackish to simply leave the tile face in, but change it to be completely transparent if it no longer needs to be rendered?

EgonOlsen

No, moving it out of sight is faster than setting it to transparent. About recreating the object...I'm not sure how expensive that is. There's a bulk constructor in Object3D that might come in handy for this, but it still might suck. It's worth a try though.

Anyway, i went ahead and implemented a simple example that shows the basic idea (albeit without moving vertices out of sight, all cubes are visible all the time). It renders 1331 (11*11*11) blocks using a simple translation.

First, the version where each cube is a single Object3D. This version renders @~110fps on my machine:


package com.threed.jpct.demos.blocks;

import java.util.*;

import com.threed.jpct.*;
import com.threed.jpct.util.*;

/**
 * Blocks done the "normal" way...
 *
 * @author EgonOlsen
 *
 */
public class Blocks {

    private World world;
    private FrameBuffer buffer;
    private List<Object3D> cubes;
    private Object3D cube;
    private float step = 1f;

    public static void main(String[] args) throws Exception {
        new Blocks().loop();
    }

    public Blocks() throws Exception {
        world = new World();
        world.setAmbientLight(100, 100, 100);

        Light light = new Light(world);
        light.setIntensity(255, 255, 255);
        light.setPosition(new SimpleVector(0, 0, -100));

        cubes = new ArrayList<Object3D>();

        cube = Primitives.getCube(2);
        cube.rotateY((float) (Math.PI / 4d));
        cube.rotateMesh();
        cube.clearRotation();
        cube.build();

        for (float y = -5; y < 6; y += step) {
            for (float x = -5; x < 6; x += step) {
                for (float z = -5; z < 6; z += step) {
                    Object3D c = new Object3D(cube, true);
                    c.shareCompiledData(cube);
                    c.compile();
                    c.build();
                    c.translate(x * 10, y * 10, z * 10);
                    cubes.add(c);
                    c.addParent(cube);
                    world.addObject(c);
                }
            }
        }

        world.getCamera().setPosition(0, 0, -150);
    }

    private void loop() throws Exception {
        buffer = new FrameBuffer(800, 600, FrameBuffer.SAMPLINGMODE_GL_AA_2X);
        buffer.disableRenderer(IRenderer.RENDERER_SOFTWARE);
        buffer.enableRenderer(IRenderer.RENDERER_OPENGL);

        int fps = 0;
        long time = System.currentTimeMillis();
        int cnt = 0;

        while (!org.lwjgl.opengl.Display.isCloseRequested()) {

            boolean plus = (cnt / 100) % 2 != 0;
            for (Object3D cube : cubes) {
                SimpleVector trsn = cube.getTranslation();
                if (trsn.length() > 2) {
                    trsn = trsn.normalize();
                    trsn.scalarMul(0.5f);
                    if (!plus) {
                        trsn.scalarMul(-1f);
                    }
                    cube.translate(trsn);
                }
            }
            cnt++;

            cube.rotateY(0.01f);
            buffer.clear(java.awt.Color.BLUE);
            world.renderScene(buffer);
            world.draw(buffer);
            buffer.update();
            buffer.displayGLOnly();
            fps++;
            if (System.currentTimeMillis() - time >= 1000) {
                System.out.println(fps + "fps");
                time = System.currentTimeMillis();
                fps = 0;
            }
        }
        buffer.disableRenderer(IRenderer.RENDERER_OPENGL);
        buffer.dispose();
        System.exit(0);
    }
}



Second, the version where all cubes are merged into a mega cube. This version renders @~1500fps on my machine:


package com.threed.jpct.demos.blocks;

import java.util.*;

import com.threed.jpct.*;
import com.threed.jpct.util.*;

/**
 * Blocks done the "hacky" way...
 *
 * @author EgonOlsen
 *
 */
public class Bloxx {

    private World world;
    private FrameBuffer buffer;
    private List<ObjectData> cubes;
    private Object3D cube;
    private Object3D megaCube;
    private float step = 1f;

    public static void main(String[] args) throws Exception {
        new Bloxx().loop();
    }

    public Bloxx() throws Exception {
        world = new World();
        world.setAmbientLight(100, 100, 100);

        Light light = new Light(world);
        light.setIntensity(255, 255, 255);
        light.setPosition(new SimpleVector(0, 0, -100));

        cubes = new ArrayList<ObjectData>();

        cube = Primitives.getCube(2);
        cube.rotateY((float) (Math.PI / 4d));
        cube.rotateMesh();
        cube.clearRotation();
        cube.build();

        List<Object3D> tmpCubes = new ArrayList<Object3D>();

        for (float y = -5; y < 6; y += step) {
            for (float x = -5; x < 6; x += step) {
                for (float z = -5; z < 6; z += step) {
                    Object3D c = new Object3D(cube, true);
                    c.build();

                    ObjectData od = new ObjectData();
                    od.translate.set(x * 10, y * 10, z * 10);
                    cubes.add(od);

                    tmpCubes.add(c);
                }
            }
        }

        megaCube = Object3D.mergeAll(tmpCubes.toArray(new Object3D[0]));
        megaCube.compile(true);
        megaCube.build();
        world.addObject(megaCube);

        megaCube.addParent(cube);

        megaCube.getMesh().setVertexController(new CubeTransformer(), false);

        world.getCamera().setPosition(0, 0, -150);
    }

    private void loop() throws Exception {
        buffer = new FrameBuffer(800, 600, FrameBuffer.SAMPLINGMODE_GL_AA_2X);
        buffer.disableRenderer(IRenderer.RENDERER_SOFTWARE);
        buffer.enableRenderer(IRenderer.RENDERER_OPENGL);

        int fps = 0;
        long time = System.currentTimeMillis();
        int cnt = 0;

        while (!org.lwjgl.opengl.Display.isCloseRequested()) {

            boolean plus = (cnt / 100) % 2 != 0;
            for (ObjectData cube : cubes) {
                SimpleVector trsn = new SimpleVector(cube.translate);
                if (trsn.length() > 2) {
                    trsn = trsn.normalize();
                    trsn.scalarMul(0.5f);
                    if (!plus) {
                        trsn.scalarMul(-1f);
                    }
                    cube.translate.add(trsn);
                }
            }
            cnt++;

            megaCube.getMesh().applyVertexController();
            megaCube.touch();

            cube.rotateY(0.01f);
            buffer.clear(java.awt.Color.BLUE);
            world.renderScene(buffer);
            world.draw(buffer);
            buffer.update();
            buffer.displayGLOnly();
            fps++;
            if (System.currentTimeMillis() - time >= 1000) {
                System.out.println(fps + "fps");
                time = System.currentTimeMillis();
                fps = 0;
            }
        }
        buffer.disableRenderer(IRenderer.RENDERER_OPENGL);
        buffer.dispose();
        System.exit(0);
    }

    private class CubeTransformer extends GenericVertexController {

        private static final long serialVersionUID = 1L;

        @Override
        public void apply() {
            SimpleVector[] toMesh = this.getDestinationMesh();
            // SimpleVector[] toNormals=this.getDestinationMesh();

            SimpleVector[] fromMesh = this.getSourceMesh();
            // SimpleVector[] fromNormals=this.getSourceMesh();

            int cubeCnt = -1;
            SimpleVector translate = null;
            SimpleVector tmp = new SimpleVector();

            for (int i = 0; i < fromMesh.length; i++) {
                if (i % 10 == 0) {
                    cubeCnt++;
                    translate = cubes.get(cubeCnt).translate;
                }
                tmp.set(fromMesh[i]);
                tmp.add(translate);

                toMesh[i].set(tmp);
            }
        }
    }

    private static class ObjectData {
        SimpleVector translate = new SimpleVector();
        Matrix rotation = new Matrix();
    }
}


There are a few pitfalls with the second version of course...for example, all translations and rotations (which i'm not using in this example) have to be done in the IVertexController in object space instead of world space like in the first example. In this simple case, this actually doesn't matter though. I'm not going to say that this is the way to go, i just wanted to illustrate the idea.
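
To illustrate the object-space point: in the second example every source cube sits at the origin of the merged mesh, so a per-block rotation would have to be applied to the source vertex first and the block's translation added afterwards. The sketch below does this for a rotation around the Y axis in plain Java. BlockTransform is a hypothetical helper, and the sign convention of the rotation is an assumption that may not match jPCT's.

```java
// Hypothetical helper: per-block rotation followed by translation, both in
// the object space of the merged mesh.
final class BlockTransform {
    // v is a vertex relative to the block's own center, angle is in radians,
    // t is the block's position inside the merged object.
    static float[] rotateYThenTranslate(float[] v, float angle, float[] t) {
        float c = (float) Math.cos(angle);
        float s = (float) Math.sin(angle);
        // Rotate around the Y axis; y itself is unchanged by this rotation.
        float x = v[0] * c + v[2] * s;
        float z = -v[0] * s + v[2] * c;
        // Translate only after rotating, otherwise the block would orbit the
        // origin of the merged object instead of spinning in place.
        return new float[] {x + t[0], v[1] + t[1], z + t[2]};
    }
}
```

Inside a vertex controller, this would replace the plain add of the translation for each vertex of the block in question.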

EgonOlsen

Note: For the second example, you might want to use the latest beta jar from here: http://jpct.de/download/beta/jpct.jar. It makes animations default to VBOs instead of vertex arrays, which should be faster.

jpro

Awesome, thanks for the code sample. I'll admit I am reluctant to use something like that. It does indeed seem very hackish, although it is difficult to argue with results. I'll work on something that combines each chunk into a single or low number of objects and compare to see which one of these solutions comes out ahead.