Crash (native)

Started by AeroShark333, July 10, 2017, 07:29:06 PM


EgonOlsen

That's the one (or at least very similar) that I'm seeing for my own apps every now and then. But it doesn't really help. All the engine is doing is telling the GL driver to draw some geometry. Why this causes a crash deep inside the GL pipeline under some circumstances is beyond me. Another problem is that you can't be sure when this happens, because there's no context. You don't know if it happens during normal operation (I've never experienced this on one of my own devices) or when the context is shutting down anyway (or doing some other weird thing).

AeroShark333

I actually do have an emulator on my PC where I can reproduce this SIGSEGV error.
However, I'm not sure if the emulator crash truly is identical to the crashes that users receive.

In Android Studio, I've been using the debug option to test some things out.
I'd let the application stop once it hits the following line:
gl11.glDrawElements(this.primitiveType, this.indexCount, 5123, 0); // 5123 = GL_UNSIGNED_SHORT

That line doesn't run until I tell the debugger to continue; the app basically hangs/suspends in the meantime.
While the app is suspended, I can evaluate custom code through the debugger, as follows:
for (int xxx = 3; xxx < this.indexCount; xxx += 3) {
    System.out.println("Indexcount=" + xxx);
    gl11.glDrawElements(this.primitiveType, xxx, 5123, 0);
}


Now there are three different scenarios:
- The application crashes when a certain index count (xxx) has been reached. (lower limit)
- The application crashes for another index count (xxx+6). (higher limit)
- The application doesn't crash at all (because it does still happen randomly...)

So for an Object3D, there are always two index counts at which it could crash: xxx and xxx+6.

The only thing that changes this xxx value (and xxx+6 value) is:
- the vertex count of the object
- the overall structure of the object.

I have only tested this with UV-spheres and ico-spheres.

In conclusion, I have found that if these index counts can be kept low, so that the high values are never reached, the crash won't happen.
The only way to achieve that is to lower the Config.glBatchSize value.
And yes, with the emulator this does help to reduce the occurrences of this SIGSEGV crash to 0.
I'm not sure if this will fix the issue for real devices but it's worth a shot I guess.

I have two questions remaining, however:
- Why is the default Config.glBatchSize = 8000?
- If an Object3D has, for instance, 48000 vertices and Config.glBatchSize = 8000, the Object3D is loaded in batches of 23997, 24000 and 3 vertices. Why is the first batch 3 vertices smaller than the batch size would suggest?


EgonOlsen

The 8000 is just a magic number. I had some trouble with higher values on the desktop using older nVidia drivers, so I tried to find a compromise between these problems and a reasonably high value. It then transferred over to the Android version. It might indeed help to lower it for some unknown reason.

About the strange batch sizes...I don't know ATM. I'll have a look at this.

EgonOlsen

I had a look at the batch size issue. It's a problem with the way in which batch sizes are calculated. It's only really obvious if the vertex count is a multiple of the batch size. I've fixed it, but I don't think that it requires a new version just for that. It looks strange in the log output, but it's not an actual problem.
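
For comparison, the split one would naively expect can be sketched as below. This is only an illustration, not jPCT's actual batching code: the class and method names are made up, and it assumes that one batch holds glBatchSize triangles (three vertices per unit of batch size), which is inferred from the 24000 figure reported above rather than confirmed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jPCT.
final class BatchSplitSketch {

    /** Splits vertexCount into batches of at most batchSize * 3 vertices. */
    static List<Integer> split(int vertexCount, int batchSize) {
        // Assumption: one batch holds batchSize triangles, i.e. 3 vertices each.
        int capacity = batchSize * 3;
        List<Integer> batches = new ArrayList<>();
        for (int start = 0; start < vertexCount; start += capacity) {
            // The last batch takes whatever is left over.
            batches.add(Math.min(capacity, vertexCount - start));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(split(48000, 8000)); // [24000, 24000]
        System.out.println(split(48000, 3000)); // [9000, 9000, 9000, 9000, 9000, 3000]
    }
}
```

Under that assumption, 48000 vertices at a batch size of 8000 would give two full batches instead of 23997, 24000 and 3. Lowering the batch size simply produces more, smaller draw calls.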

AeroShark333

Alright, thank you. I figured it'd not make a big difference but I mostly wondered if this was intentional so I guess not.

About the ideal batch size, I have found Config.glBatchSize=3000 to work well in my use cases when the vertex count for the Object3D is smaller than about 100000. The Config.glBatchSize needs to be decreased further when your Object3D has a higher vertex count. But this differs per Object3D structure again...

In my use case the distance between vertices becomes smaller for higher vertex counts... I don't know if it could be a problem that the vertices become more dense. I have noticed that the required glBatchSize drops more steeply for higher-polygon-count UV-spheres than for higher-polygon-count ico-spheres.

EgonOlsen

Honestly, I've no idea about how and why the vertex count can cause an issue. As mentioned, I had some similar experience on the desktop and I debugged the hell out of it, thinking that my counts were somehow wrong and that I was rendering some out of bounds geometry. But that's not the case, so I really don't know what the issue is here...

AeroShark333

Offtopic but any clue what could cause this crash?

java.lang.RuntimeException:
  at com.threed.jpct.Object3D.render (Object3D.java:6595)
  at com.threed.jpct.World.renderScene (World.java:1079)
  at com.aeroshark333.artofearthify.lw.ArtOfEarthify.onDrawFrame (ArtOfEarthify.java:2331)
  at com.aeroshark333.artofearthify.lw.LiveWallpaperRenderer.onDrawFrame (LiveWallpaperRenderer.java:73)
  at android.opengl.GLSurfaceView$GLThread.guardedRun (GLSurfaceView.java:1571)
  at android.opengl.GLSurfaceView$GLThread.run (GLSurfaceView.java:1270)
  at com.threed.jpct.Logger.log
  at com.threed.jpct.Logger.log (Logger.java:150)
  at com.threed.jpct.World.renderScene (World.java:1095)
  at com.aeroshark333.artofearthify.lw.ArtOfEarthify.onDrawFrame (ArtOfEarthify.java:2331)
  at com.aeroshark333.artofearthify.lw.LiveWallpaperRenderer.onDrawFrame (LiveWallpaperRenderer.java:73)
  at android.opengl.GLSurfaceView$GLThread.guardedRun (GLSurfaceView.java:1571)
  at android.opengl.GLSurfaceView$GLThread.run (GLSurfaceView.java:1270)


Device: Samsung Galaxy S8, 3GB RAM, Android 9 (not my device)

EgonOlsen

Judging from the line in World, which causes the actual problem (1095), there should also be this output somewhere in the log:

Quote
There's a problem with the object list not being consistent during rendering. This is often caused by concurrent modification of jPCT objects on a thread different from the rendering thread!

Are you by any chance modifying the world's collection of objects in some other thread, like when a touch event happens?

AeroShark333

Quote from: EgonOlsen on September 21, 2020, 09:48:07 AM
Judging from the line in World, which causes the actual problem (1095), there should also be this output somewhere in the log:

Quote
There's a problem with the object list not being consistent during rendering. This is often caused by concurrent modification of jPCT objects on a thread different from the rendering thread!

Are you by any chance modifying the world's collection of objects in some other thread, like when a touch event happens?
I think that line is only printed with a NullPointerException but I'm not sure about this RuntimeException...

I'm not modifying the world's collection of objects at that point any more... So I am confused...

I don't have access to any other logs however... So I'm puzzled...
I wonder what the Object3D.java:6595 line is all about though

EgonOlsen

That puzzles me as well. This stack trace doesn't make much sense as a whole. It looks as if two stack traces have been mixed together, because you just can't go from GLSurfaceView$GLThread.run()...to Logger.log()...back to GLSurfaceView$GLThread.run(). Something is messed up here. What line 6595 in Object3D refers to depends on the version you are using. If it's the most recent beta from some weeks ago, then it happens while accessing the lights that this object is being influenced by for this frame. That's a temp list used during rendering. It can't change at this stage unless the object is fiddled around with in another thread.

But only the part until the Logger.log looks reasonable and that points to a nullpointer while processing the world's objects, which can actually only be caused by some other thread fiddling around with either this list or an object itself which would explain the problem in line 6595 as well.

AeroShark333

I'm using the latest beta (which has 8 texturelayers support)
I'm only using a single light source at [0,0,0], which doesn't ever change actually...

I don't think I'm changing any lighting in any other thread actually...
I believe the only thing that could be altered from other threads that might consider the Object3D is: camera position/orientation and shader variables (uniforms).
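
One way to rule out such cross-thread changes is to never apply them directly, but to queue them and run them on the rendering thread before the frame is drawn. The following is a minimal sketch of that idea in plain Java; the class and method names are hypothetical and not part of jPCT. On Android, GLSurfaceView.queueEvent(Runnable) serves a similar purpose.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper, not part of jPCT.
final class RenderThreadQueue {
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    /** May be called from any thread, e.g. a touch or settings handler. */
    void post(Runnable change) {
        pending.add(change);
    }

    /**
     * Call once per frame on the rendering thread, before drawing.
     * Returns the number of changes that were applied.
     */
    int drain() {
        int applied = 0;
        Runnable change;
        while ((change = pending.poll()) != null) {
            change.run();
            applied++;
        }
        return applied;
    }
}
```

In this sketch, a camera move or uniform update would be wrapped in a Runnable and passed to post(), and onDrawFrame would call drain() before rendering the scene, so that everything touching the scene runs on one thread.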

EgonOlsen

Quote from: AeroShark333 on September 21, 2020, 03:04:28 PM
I'm using the latest beta (which has 8 texturelayers support)
In that case, it's this line:


int id = (int) ((float[]) ls.get(i))[1];


i is the index into the lights list (ls). It ranges from 0 to the list's length (exclusive). 1 is an index into the float[] array that has been stored in this list. It can never be null, because it's created at startup (just as ls). However, both the list and the array are static instances in Object3D (yes, that's ugly but I had to do it that way to avoid object creation and garbage collection, especially on older versions of Android). That means that any other thread calling render on any other Object3D will mangle this list. This has to be what happens here, I don't see any other way. Maybe it happens when the GL context/rendering thread changes because the device wakes up from sleep or something like that? So that the old thread hasn't been terminated while the new one already exists? I've no idea, I'm just guessing here...
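
If one wanted to check the guess about two rendering threads overlapping, a small guard on the application side could record which thread renders first and flag any call from a different thread. This is only a diagnostic sketch in plain Java with made-up names; it is not something jPCT provides.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical diagnostic helper, not part of jPCT.
final class RenderThreadGuard {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    /** Returns true if the calling thread is the thread that claimed ownership first. */
    boolean check() {
        Thread current = Thread.currentThread();
        // The first caller claims ownership; later callers are compared against it.
        owner.compareAndSet(null, current);
        return owner.get() == current;
    }

    /** Call when the renderer is deliberately recreated, e.g. for a new GL context. */
    void reset() {
        owner.set(null);
    }
}
```

Calling check() at the top of onDrawFrame and logging whenever it returns false would show whether a second thread ever renders while the first one is still considered the owner.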

AeroShark333

I'm guessing then it's either context change, resuming of renderer or pausing of renderer...
I don't think it could be anything else..?

I believe I did have some issues with NullPointerExceptions before when resuming/pausing/changing context. I did catch it before and if I remember correctly the problem would fix itself..? So the NPE's would disappear shortly after such events if I remember correctly. But I removed catching the error/exception in this version because it also caught shader crashes which I didn't want to catch.

Is it possible to 'test' a written shader beforehand (when loading), to check that it compiles and could work? I think I tried FrameBuffer.compileShader before but it doesn't throw a crash or error..?
It would only crash once the Object3D that uses the GLSLShader becomes visible.

I guess I could ignore it for now as it's only one occurrence...
But yeah, just letting you know

EgonOlsen

FrameBuffer.compileShader() compiles the shader. It just doesn't really execute it (how should it...). So you can test whether it compiles that way, but not whether it really works.

AeroShark333

Hmmm, okay...

I think I'll fix it with:
frameBuffer.compileShader(shader, null);
frameBuffer.setBlittingShader(shader);
frameBuffer.blit(brightness, 0, 0, 0, 0, 2, 2, false);
frameBuffer.setBlittingShader(null);

to test every shader. This does seem to trigger errors/crashes and works for Object3D shaders too.

That way I can still catch the RuntimeException from World.renderScene/World.draw whenever it happens, without needing to blame it on faulty shaders, because those can be checked beforehand now.
I'm still guessing the exception is thrown due to context pause/resume/change but I'm not entirely sure.