How to synch WAV audio and 3D animation?

Started by tjm, April 10, 2010, 02:20:54 PM


tjm

Hi all.

This is really an 'any ideas?' sort of post .....

What would be the best approach for synching an audio track to an animation?  A use case here would be lip-synching an avatar to dialog.

The audio is WAV format, being played with Paul's SoundSystemJPCT, and the game loop and timing are derived from the Slick2D libraries, using LWJGL's Sys class to generate deltas as milliseconds. It all works OK, but various latencies and system loads mean the audio and animation can get out of synch very easily.
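For reference, the delta code is the usual LWJGL timer pattern, something like:

    import org.lwjgl.Sys;

    long lastFrame;

    // Milliseconds elapsed since the last call, from LWJGL's hi-res timer:
    int getDelta()
    {
        long time = ( Sys.getTime() * 1000 ) / Sys.getTimerResolution();
        int delta = (int) ( time - lastFrame );
        lastFrame = time;
        return delta;
    }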

If it helps, I'm using Acid Pro 7 to sequence and create the audio.

My thoughts at the moment are to create time markers, then

  • at runtime calculate the number of bytes to be played until the next marker,
  • count bytes
  • call a listener method (rough sketch below)

or to do the audio as MIDI, which would entail

  • implement a custom MIDI sequencer
  • intercept 'Wire' protocol data
  • call listeners for 'note on' and 'note off' messages (rough sketch below)
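To make the first option concrete, here's a rough sketch of the byte-counting idea -- every name in it is a placeholder of mine, not an existing API:

    interface MarkerListener
    {
        void markerReached( int markerMillis );
    }

    int[] markerMillis;   // marker times in millis, sorted ascending
    int nextMarker = 0;
    long bytesPlayed = 0;

    // For PCM: bytesPerMilli = sampleRate * channels * (bitsPerSample / 8) / 1000
    int bytesPerMilli;

    // Called from wherever the audio bytes are actually consumed:
    void onBytesPlayed( int byteCount, MarkerListener listener )
    {
        bytesPlayed += byteCount;
        long playedMillis = bytesPlayed / bytesPerMilli;
        while( nextMarker < markerMillis.length
                && playedMillis >= markerMillis[nextMarker] )
            listener.markerReached( markerMillis[nextMarker++] );
    }

And for the MIDI option, javax.sound.midi lets you tap the 'wire' protocol with an extra Receiver -- NoteListener is again a hypothetical callback of mine, and loading the Sequence and calling start() are omitted:

    import javax.sound.midi.*;

    interface NoteListener
    {
        void noteOn( int channel, int note, int velocity );
        void noteOff( int channel, int note );
    }

    void watchNotes( final NoteListener listener ) throws MidiUnavailableException
    {
        // true = keep the sequencer connected to the default synth so it
        // still makes sound; we just listen in with an extra Receiver:
        Sequencer sequencer = MidiSystem.getSequencer( true );
        sequencer.open();
        sequencer.getTransmitter().setReceiver( new Receiver()
        {
            public void send( MidiMessage message, long timeStamp )
            {
                if( !( message instanceof ShortMessage ) )
                    return;
                ShortMessage sm = (ShortMessage) message;
                // NOTE_ON with velocity 0 is a note-off by convention:
                if( sm.getCommand() == ShortMessage.NOTE_ON && sm.getData2() > 0 )
                    listener.noteOn( sm.getChannel(), sm.getData1(), sm.getData2() );
                else if( sm.getCommand() == ShortMessage.NOTE_OFF
                         || sm.getCommand() == ShortMessage.NOTE_ON )
                    listener.noteOff( sm.getChannel(), sm.getData1() );
            }
            public void close() {}
        } );
        // load a Sequence and call sequencer.start() elsewhere
    }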

Does anyone have any other ideas?


On a slightly different topic, anyone interested in using a Wiimote for game input? If there's interest I'll post my sample code and details of setting it all up.

--Tim.

paulscode

In the latest version of SoundSystem (which I hope to have ready this weekend), I added events for receiving messages when the end of a stream is reached.  I'll look into expanding on this idea to make it so events are sent out each time a buffer is added to the stream.  This information could be used to determine where you are in the stream periodically, and then use that information to calculate slight alterations to either the animation speed or the streaming source's pitch to realign the two.
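The shape I have in mind is something like this (names are not final, and the buffer event is not in the current API yet):

    // Hypothetical sketch of the listener idea -- not the released API:
    public interface StreamListener
    {
        // Fired when the end of a stream is reached:
        void endOfStream( String sourcename, int queueSize );

        // Possible extension -- fired each time a buffer is queued, with
        // enough information to estimate the current playback position:
        void bufferQueued( String sourcename, long millisPreviouslyQueued );
    }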

tjm

QuoteI'll look into expanding on this idea to make it so events are sent out each time a buffer is added to the stream.  This information could be used to determine where you are in the stream periodically, and then use that information to calculate slight alterations to either the animation speed or the streaming source's pitch to realign the two.

Paul, that would be great!!  Could you post a URL for a 'preview' if the release isn't ready? I can poke around, look at the messaging you've implemented, and possibly add something for messages on buffer additions.

--Tim.

paulscode

#3
Ok, will do.  I'll post a link in a little bit.

--EDIT--
I will need to figure out how to determine one value before I can fully implement this (the number of bytes of the current buffer still unprocessed at the time of buffer queuing).  I know how to do this in JavaSound, but I'll have to do some research to figure it out in OpenAL.  The 'preview' may only have this concept implemented for JavaSound, depending on whether or not I can google up how to do this in OpenAL without having to post a question on various dev. forums.

tjm

Paul,

I've been poking around the SoundSystem code this morning, doing a bit of OpenAL research, and it looks like calculating playback time for a streaming source should be fairly straightforward.

My initial thoughts are to add a method to the ChannelLWJGLOpenAL class, maybe something like 'playTime', that would rely on AL11.alGetSourcei(source, AL11.AL_SAMPLE_OFFSET) to return actual playback time, calculated from the number of samples played and the sample rate.

Assuming the application can get a reference to the channel object, it could simply poll the channel from the game loop, along the lines of the sketch below.
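Hypothetically, something like this, where playTime() is the proposed method and dialogChannel and TOLERANCE_MILLIS are placeholders of mine:

    ChannelLWJGLOpenAL dialogChannel;        // obtained from the library somehow
    static final int TOLERANCE_MILLIS = 40;  // how much drift to tolerate
    int animMillis = 0;

    public void update( int delta )
    {
        int audioMillis = dialogChannel.playTime();  // where the audio actually is
        animMillis += delta;                         // where the animation thinks it is

        if( Math.abs( animMillis - audioMillis ) > TOLERANCE_MILLIS )
            animMillis = audioMillis;  // snap (or ease) the animation back into sync
    }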

Sound reasonable?

--Tim.

tjm

#5
Think I have a workable solution coded, but not yet tested. There are two parts to it.

First part .... a modified ChannelLWJGLOpenAL.queueBuffer method (millisPlayed is a new float field on the channel):

    public boolean queueBuffer( byte[] buffer )
    {
        // Stream buffers can only be queued for streaming sources:
        if( errorCheck( channelType != SoundSystemConfig.TYPE_STREAMING,
                        "Buffers may only be queued for streaming sources." ) )
            return false;

        ByteBuffer byteBuffer = ByteBuffer.wrap( buffer, 0, buffer.length );

        IntBuffer intBuffer = BufferUtils.createIntBuffer( 1 );

        AL10.alSourceUnqueueBuffers( ALSource.get( 0 ), intBuffer );
        if( checkALError() )
            return false;

        // TJM -- based on concepts from: http://kcat.strangesoft.net/alffmpeg.c
        // Get the size in bytes of the unqueued buffer, convert it to a
        // sample count using the bytes-per-sample implied by ALformat, and
        // add that buffer's duration in milliseconds to the running total
        // for buffers this channel has already played:
        int bufSize = AL10.alGetBufferi( intBuffer.get( 0 ), AL10.AL_SIZE );

        switch( ALformat )
        {
            case AL10.AL_FORMAT_MONO8:    // 1 byte per sample, 1 channel
                millisPlayed += 1000f * bufSize / (float) sampleRate;
                break;
            case AL10.AL_FORMAT_MONO16:   // 2 bytes per sample, 1 channel
                millisPlayed += 1000f * ( bufSize / 2f ) / (float) sampleRate;
                break;
            case AL10.AL_FORMAT_STEREO8:  // 1 byte per sample, 2 channels
                millisPlayed += 1000f * ( bufSize / 2f ) / (float) sampleRate;
                break;
            case AL10.AL_FORMAT_STEREO16: // 2 bytes per sample, 2 channels
                millisPlayed += 1000f * ( bufSize / 4f ) / (float) sampleRate;
                break;
            default:
                break;
        }

        AL10.alBufferData( intBuffer.get( 0 ), ALformat, byteBuffer, sampleRate );
        if( checkALError() )
            return false;

        AL10.alSourceQueueBuffers( ALSource.get( 0 ), intBuffer );
        if( checkALError() )
            return false;

        return true;
    }



then the second part .... a new stub method in the Source class:

    public int getMillisPlayed()
    {
        // Overridden where a playback position can actually be calculated:
        return -1;
    }


and an overridden version of the method in SourceLWJGLOpenAL:

    public int getMillisPlayed()
    {
        // Only applicable to streaming sources:
        if( !toStream )
            return -1;

        // Number of samples played so far in the currently queued buffers:
        int offset = AL11.alGetSourcei( channelOpenAL.ALSource.get( 0 ),
                                        AL11.AL_SAMPLE_OFFSET );

        // Convert samples to milliseconds:
        offset = (int) ( offset * 1000L / channelOpenAL.sampleRate );

        // Add the millis accounted for by buffers the channel has already
        // unqueued, and return the total millis played:
        offset += channelOpenAL.millisPlayed;

        return offset;
    }



Think that'll do the trick ..... hopefully I'll try it out tomorrow.

--Tim.

paulscode

#6
I'll look into this; however, at first glance it looks like it is designed for single-buffer al sources (what I would call a "normal channel").  The question is how to make it work in a situation where an al buffer (what I call a "source's clip") played through an al source (what I call a "channel") changes (does it reset to the beginning?), or when multiple al buffers are continuously queued to an al source (what I call a "streaming source" playing through a "stream channel") -- where does it start counting from, and how does one reset the counter when looping a stream or switching to the next item in the play queue?  I'm sure there are answers to these questions - I just need to figure them out.

The other important point: whatever I do in OpenAL, I must be able to do something equivalent in JavaSound before it can be added to the SoundSystem library.  I expect there are equivalent methods in the JavaSound Clip and SourceDataLine classes; I'll just have to find them.
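If memory serves, DataLine (which both Clip and SourceDataLine extend) already exposes a playback position, so the JavaSound side might be as simple as:

    import javax.sound.sampled.Clip;
    import javax.sound.sampled.SourceDataLine;

    long clipMillis( Clip clip )
    {
        return clip.getMicrosecondPosition() / 1000;
    }

    long lineMillis( SourceDataLine line )
    {
        // For a streamed line, this counts media actually rendered since open:
        return line.getMicrosecondPosition() / 1000;
    }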

--EDIT--  I'll look into your code to see if it can be adapted to the SoundSystem.  One immediate problem I see is that you have assumed the source is attached to a channel, which is not necessarily true (it may not have been played yet, may have reached the end, or may have been shut down to free up a channel for another source).  Additionally, thread synchronization will need to be considered.  Looks like a good start, though.

tjm

Thanks, Paul. I have something sort of working, but there are problems, which could largely be a result of me not really understanding SoundSystem or OpenAL.

Anyway, the major problems encountered so far are: 1) the buffers appear to be continuously queued and unqueued even if a source is not playing, 2) my 'played millis' calculations are faster than actual playback by around 30% (are the buffers 100% full? or is this an artifact of the WAV codec trimming buffers? maybe just bad math?), and finally 3) there's a fair bit of latency between the source starting to play and getting a reference to the source object via the library (around 300-500 millis .... an artifact of the multi-threading?).

I'll keep plugging away at it, but my feeling at this point is that my current approach may not work.

--Tim.

paulscode

Quote from: tjm on April 13, 2010, 10:21:42 PM1) the buffers appear to be continuously queued and unqueued even if a source is not playing,
This should not be happening.  I'll look into this to see what's going on.  I assume you are talking about a streaming source, right? (OpenAL terminology is different from mine - in OpenAL, "buffer queuing" is done on all sources)

Quote from: tjm on April 13, 2010, 10:21:42 PM2) my 'played millis' calculations are faster than actual playback by around 30% (are the buffers 100% full? or is this an artifact of the WAV codec trimming buffers? maybe just bad math?)
Buffers are only trimmed once per clip (when the end of the data is reached), so that should not be the problem.  I'll look into this to make sure buffer sizes are correct (I recall checking this at one point, but it never hurts to check again).

Quote from: tjm on April 13, 2010, 10:21:42 PM3) there's a fair bit of latency between the source starting to play and getting a reference to the source object via the library (around 300-500 millis .... an artifact of the multi-threading?)
This is the result of two things.  Yes, one is multi-threading.  Calling a source creation method sticks a command into the queue, which is processed in the order it was queued.  It may have to wait for a previous command or for something else going on behind the scenes that causes the CommandThread to block for a short while.  Once a streaming source is created and set to play, the StreamThread is woken up and takes over, which again may add a few milliseconds of delay if that thread was busy with something else.  However, most of this initial delay is caused by reading in the initial stream buffer(s) (kind of like the "buffering..." message you get when you first start an online video stream before the video actually starts playing).

My setup may not necessarily be the best way, so I am always open to suggestions to improve the library.  This small delay issue is one of the problems I had when trying to sync audio and video in my "video texture" project a while back.  I never got around to working on a solution, so this is definitely something I am interested in solving.  I'm thinking that buffer queuing events might be the way to go, because they would not only detect when things start getting out of sync, but they would also let you know exactly when the sound actually began playing.

tjm

Quote1) the buffers appear to be continuously queued and unqueued even if a source is not playing,
Yes, a SoundSystem streaming source.

If my understanding of SoundSystem is correct, once OpenAL processes buffers for a SoundSystem streaming source, those OpenAL buffers are unqueued, refilled, then requeued again by the SoundSystem Channel.

But doesn't OpenAL mark all streaming buffers as 'processed' regardless of the actual processed state? Pretty sure I read that it does in the OpenAL programmer's guide.
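If that's right, it would explain the churn: my reading is that a stopped source reports all of its queued buffers as processed.  Guarding the unqueue with a state check might be worth a try -- a rough sketch, where sourceId is the raw al source handle:

    import org.lwjgl.openal.AL10;

    void refillProcessedBuffers( int sourceId )
    {
        // A stopped source reports ALL queued buffers as processed, so
        // don't touch the queue unless the source is actually playing:
        if( AL10.alGetSourcei( sourceId, AL10.AL_SOURCE_STATE ) != AL10.AL_PLAYING )
            return;

        int processed = AL10.alGetSourcei( sourceId, AL10.AL_BUFFERS_PROCESSED );
        // unqueue, refill, and requeue exactly 'processed' buffers here
    }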

QuoteBuffers are only trimmed once per clip (when the end of the data is reached)
I was thinking more along the lines of the WavCodec trimming its notion of a buffer, which isn't necessarily the same size as the OpenAL buffer. The result of that would be that the OpenAL buffer isn't 100% full, so this call

int bufSize = AL10.alGetBufferi( intBuffer.get( 0 ), AL10.AL_SIZE );

doesn't accurately reflect the audio data in the buffer, hence the cumulative errors I'm seeing.

But I'm just speculating 'cause I really don't know.

Quote from: tjm on April 13, 2010, 10:21:42 PMthere's a fair bit of latency between the source starting to play and getting a reference to the source object via the library
Quote from: paulscode on April 14, 2010, 12:26:25 AMThis is the result of two things.  Yes, one is multi-threading
Thanks for that ..... good to know. Not sure if that's a show-stopper for me yet.

All in all, I think the SoundSystem is a pretty good framework. I'm just tracking method calls and making guesses, trying to get this to work ..... I really have no clear idea what I'm doing.

--Tim.

paulscode

Quote from: tjm on April 14, 2010, 11:13:35 PMIf my understanding of SoundSystem is correct, once OpenAL processes buffers for a SoundSystem streaming source, those OpenAL buffers are unqueued, refilled, then requeued again by the SoundSystem Channel.
Yes, that is correct.

Quote from: tjm on April 13, 2010, 10:21:42 PM1) the buffers appear to be continuously queued and unqueued even if a source is not playing
Quote from: tjm on April 14, 2010, 11:13:35 PMBut doesn't OpenAL mark all streaming buffers as 'processed' regardless of the actual processed state? Pretty sure I read that it does in the OpenAL programmer's guide.
Possibly, but the result of that would be for OpenAL to report that it is further ahead in the stream than it actually is.  It wouldn't explain buffers being continuously queued and unqueued whenever a source is not playing.  When a streaming source is paused or stopped or hasn't been played yet, there should not be any data being read in or buffers being created.  I haven't had time to look at this yet, but it sounds like a bug to me.

Quote from: paulscode on April 14, 2010, 12:26:25 AMBuffers are only trimmed once per clip (when the end of the data is reached), so that should not be the problem.
Quote from: tjm on April 14, 2010, 11:13:35 PMI was thinking more along the lines of the WavCodec trimming it's notion of a buffer, which isn't necessarily the same size as the OpenAL buffer.
Just to clarify what I meant, the codec plug-ins do not trim any buffers except possibly the last one.  What they return is a chunk of uncompressed PCM audio data that is always the number of bytes returned by SoundSystemConfig.getStreamingBufferSize() unless there is not enough data left to fill a buffer of that size (i.e. the last chunk of data).  Once the data is passed to OpenAL, I can't say what happens to it after that.  Perhaps it does get smaller, but I kind of doubt it - the data passed to OpenAL is your basic, ready-to-use, uncompressed PCM audio data, so unless OpenAL encodes or compresses it somehow, it shouldn't shrink in size.  I'll look and see if there is a way to determine the size in bytes of an OpenAL buffer, so we'll know for sure if it is smaller than the chunk of data used to create it.

Quote from: paulscode on April 14, 2010, 12:26:25 AMone is multi-threading
Quote from: tjm on April 14, 2010, 11:13:35 PMNot sure if that's a show-stopper for me yet.
I'm guessing that decision will probably depend on if I can figure out a reliable way to sync audio.  This is definitely high on my list of priorities - something I need to be able to do for my game as well.

tjm

Quote from: tjm on April 13, 2010, 10:21:42 PM1) the buffers appear to be continuously queued and unqueued even if a source is not playing
Quote from: tjm on April 14, 2010, 11:13:35 PMBut doesn't OpenAL mark all streaming buffers as 'processed' regardless of the actual processed state? Pretty sure I read that it does in the OpenAL programmer's guide.
Quote from: paulscodePossibly, but the result of that would be for OpenAL to report that it is further ahead in the stream than it actually is.  It wouldn't explain buffers being continuously queued and unqueued whenever a source is not playing.  When a streaming source is paused or stopped or hasn't been played yet, there should not be any data being read in or buffers being created.  I haven't had time to look at this yet, but it sounds like a bug to me.

This one is solved ... it's user error :-[

I have two versions of the same WAV file, one padded with silence and one without silence. I was loading from the package that contains the silence-padded file.

--Tim.

tjm

Paul,

I just emailed you a working solution for streaming sources, then realized the methods and variables are misnamed! Calculations are returned as seconds, not milliseconds.

It hasn't been put through rigorous testing, but initial tests show it returns elapsed play time accurate to within about 10 ms for short audio files (10 seconds or less) and within a few milliseconds for long audio files (5 minutes). The inaccuracy is probably due to rounding errors.

--Tim.


paulscode

Ok, thanks.  Sorry I haven't had any time to look at this like I planned (I've been really busy with work, and had to go in the past couple of weekends as well, so no time for programming).