Why does GetDeliveryBuffer block with an INTERLEAVE_CAPTURE mode AVI Mux? - directshow

I'm trying to use a customized filter to receive video and audio data from an RTSP stream and deliver samples downstream in the graph.
It seems that this filter was modified from the SDK Source.cpp sample (CSource), and it implements two output pins for audio and video.
When the filter is connected directly to an AVI Mux filter in INTERLEAVE_NONE mode, it works fine.
However, when the interleave mode of the AVI Mux is set to INTERLEAVE_CAPTURE,
the video output pin hangs in the GetDeliveryBuffer method (in DoBufferProcessingLoop) of this filter after several samples have been sent,
while the audio output pin still works well.
Moreover, when I inserted an Infinite Pin Tee filter into one of the paths between the AVI Mux and this source filter,
the graph arbitrarily went into the stopped state after some samples had been sent (one to three samples or so).
And when I put a filter that is just an empty transform-in-place filter which does nothing after the Infinite Tee,
the graph went back to the first case: it never enters the stopped state, but hangs in GetDeliveryBuffer.
(Here is an image that shows the connections I've mentioned.)
So here are my questions:
1: What could cause the video output pin to hang in GetDeliveryBuffer?
My guess is that the AVI Mux holds on to these sample buffers and does not release them until it has enough of them for interleaving,
but even when I set the number of video buffers to 30 in DecideBufferSize it still hangs. If that is indeed the reason, how do I decide the buffer size of a pin feeding a downstream AVI muxer?
Creating more than 50 buffers for a video pin is probably not guaranteed to work, because that much memory cannot be promised. :(
2: Why does the graph go into the stopped state when the Infinite Pin Tee is inserted? And why does a no-operation filter overcome that?
Any answer or suggestion is appreciated, or just some directions. Thanks.

A blocked GetDeliveryBuffer means the allocator you are requesting a buffer from does not [yet] have anything for you: all media samples are outstanding and have not yet been returned to the allocator.
An obvious workaround is to request more buffers at the pin connection and memory allocator negotiation stage. This, however, just postpones the issue, which can reappear later for the very same reason.
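For illustration, a minimal sketch (my own, assuming a pin built on CBaseOutputPin from the DirectShow base classes; CMyVideoPin and m_cbFrame are placeholder names) of asking for more buffers during negotiation:
// Sketch only: request a deeper buffer pool during allocator negotiation.
// CMyVideoPin and m_cbFrame are placeholder names; the counts are examples.
HRESULT CMyVideoPin::DecideBufferSize(IMemAllocator *pAlloc,
                                      ALLOCATOR_PROPERTIES *pRequest)
{
    CheckPointer(pAlloc, E_POINTER);
    CheckPointer(pRequest, E_POINTER);

    if (pRequest->cBuffers < 30)              // more than the usual 1-2 buffers
        pRequest->cBuffers = 30;
    if (pRequest->cbBuffer < (LONG)m_cbFrame) // m_cbFrame: your largest sample size
        pRequest->cbBuffer = (LONG)m_cbFrame;
    if (pRequest->cbAlign == 0)
        pRequest->cbAlign = 1;

    ALLOCATOR_PROPERTIES Actual;
    HRESULT hr = pAlloc->SetProperties(pRequest, &Actual);
    if (FAILED(hr))
        return hr;

    // The allocator may grant less than requested; treat that as a failure here.
    return (Actual.cBuffers < pRequest->cBuffers ||
            Actual.cbBuffer < pRequest->cbBuffer) ? E_FAIL : S_OK;
}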
A typical issue with the topology in question is related to threading. A multiplexer filter which has two inputs will have to match the input streams to produce a joint file. Quite often, at run time, it will be holding media samples on one leg while expecting more media samples to come in on the other leg, on another thread. It assumes that the upstream branches providing media samples are running independently, so that a lock on one leg does not lock the other. This is why a multiplexer can freely both block in IMemInputPin::Receive and/or hold media samples inside. In the topology above it is not clear how exactly the source filter does its threading. The fact that it has two pins makes me assume it might have threading issues and might not be taking into account that there could be a lock downstream in the multiplexer.
Supposedly the source filter is yours and you have source code for it. You are interested in making sure the audio pin is sending media samples on a separate thread, such as through an asynchronous queue.
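As an illustration of such an asynchronous queue, here is a minimal sketch assuming the filter is built on the DirectShow base classes and using their COutputQueue helper; CMyOutputPin, m_pQueue and DeliverSample are placeholder names, not the actual filter's code:
// Sketch only: per-pin asynchronous delivery using COutputQueue from the
// DirectShow base classes (outputq.h). CMyOutputPin, m_pQueue and DeliverSample
// are placeholder names.
#include <streams.h>

HRESULT CMyOutputPin::Active()    // graph goes to paused/running
{
    HRESULT hr = CBaseOutputPin::Active();
    if (FAILED(hr))
        return hr;
    // bAuto = FALSE, bQueue = TRUE forces a dedicated delivery thread for this pin,
    // so a blocked IMemInputPin::Receive in the muxer cannot stall the other pin.
    m_pQueue = new COutputQueue(GetConnected(), &hr, FALSE, TRUE, 1, FALSE, 16);
    if (m_pQueue == NULL)
        hr = E_OUTOFMEMORY;
    return hr;
}

HRESULT CMyOutputPin::Inactive()  // graph stops
{
    delete m_pQueue;
    m_pQueue = NULL;
    return CBaseOutputPin::Inactive();
}

HRESULT CMyOutputPin::DeliverSample(IMediaSample *pSample)
{
    // Queues the sample and returns; the downstream Receive runs on the queue thread.
    return m_pQueue->Receive(pSample);
}
With one queue per output pin, the muxer holding video samples for interleaving can block the video delivery thread without preventing the audio pin from pushing its own samples.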

Related

GMFBridge DirectShow filter SetLiveTiming effect

I am using the excellent GMFBridge DirectShow family of filters to great effect, allowing me to close a video recording graph and open a new one with no data loss.
My original source graph was capturing live video from standard video and audio inputs.
There is an undocumented method on the GMFBridgeController filter named SetLiveTiming(). From the name, I figured that this should be set to true if we are capturing from a live graph (not from a file), as is my case. I set this value to true and everything worked as expected.
The same capture hardware allows me to capture live TV signals (ATSC in my case), so I created a new version of the graph using the BDA architecture filters for tuning purposes. Once the data flows out of the MPEG demuxer, the rest of the graph is virtually the same as my original graph.
However, on this occasion my muxing graph (on the other side of the bridge) was not working. Data flowed from the BridgeSource filter (video and audio) and reached an MP4 muxer filter; however, no data flowed from the muxer output feeding a FileWriter filter.
After several hours I traced the problem to the SetLiveTiming() setting. I turned it off and everything began working as expected, and the muxer filter began producing an output file; however, the audio was not synchronized with the video.
Can someone enlighten me on the real purpose of the SetLiveTiming() setting and, perhaps, why one graph works with the setting enabled while the other fails?
UPDATE
I managed to compile the GMFBridge project, and it seems that the filter is dropping every received sample because of a negative timestamp computation. However, I am completely baffled by the results I am seeing after enabling the filter log.
UPDATE 2: The dropped samples were introduced by the way I launched the secondary (muxer) graph. I inspected a sample using a SampleGrabber (thus inside a streaming thread) as a trigger point and used a Task.Run() .NET call to instantiate the muxer graph. This somehow messed up the clocks and I ended up having a 'reference start point' in the future; when the bridge attempted to fix the timestamp by subtracting the reference start point, it produced a negative timestamp. Once I corrected this and spawned the graph from the application thread (by posting a graph event), the problem was fixed.
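For the record, a minimal sketch of that pattern in plain Win32/C++ terms (the actual project is .NET; WM_APP_BUILD_MUXER, g_hwndApp and BuildMuxerGraph are hypothetical names): only signal from the streaming thread, and do the graph building in the application thread's message handler.
// Sketch of the marshalling idea described above; all names are hypothetical.
#include <windows.h>

#define WM_APP_BUILD_MUXER (WM_APP + 1)

extern HWND g_hwndApp;            // window that runs the application's message loop
void BuildMuxerGraph();           // creates and runs the secondary (muxer) graph

// Called from the SampleGrabber callback (streaming thread): only post, never build.
void OnTriggerSample()
{
    PostMessage(g_hwndApp, WM_APP_BUILD_MUXER, 0, 0);
}

// Application window procedure (application thread) does the actual graph building.
LRESULT CALLBACK AppWndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    if (msg == WM_APP_BUILD_MUXER) {
        BuildMuxerGraph();
        return 0;
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);
}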
Unfortunately, my multiplexed video (regardless of the SetLiveTiming() setting) is still out of sync.
I read that the GMFBridge filter can have trouble when the InfTee filter is being used; however, I think that my graph shouldn't have this problem, as no instance of the InfTee filter is directly connected to the bridge sink.
Here is my current source graph:
                                                                 -->[TIF]
                                                                 |
[NetworkProvider]-->[DigitalTuner]-->[DigitalCapture]-->[demux]--|-->[Mpeg Tables]
                                                                 |
                                                                 |-->[lavAudioDec]-->[tee]-->[audioConvert]-->[sampleGrabber]-->[NULL]
                                                                 |                     |
                                                                 |                     |
                                                                 |                     ->[aacEncoder]----------------
                                                                 |                                                   |--->[*Bridge Sink*]
                                                                 -->[VideoDecoder]-->[sampleGrabber]-->[x264Enc]-----
Here is my muxer graph:
                     video
... |bridge source|-------->[MP4 muxer]--->[fileWriter]
             |                   ^
             |       audio       |
             ---------------------
All the sample grabbers in the graph are read-only. If I mux the output file without bridging (by placing the muxer on the capture graph), the output file remains in sync (this turned out not to be true; the out-of-sync problem was introduced by a latency setting in the H264 encoder), but then I can't avoid losing some seconds between releasing the current capture graph and running the new one (with the updated file name).
UPDATE 3:
The out-of-sync problem was inadvertently introduced by me several days ago, when I switched off a "Zero-latency" setting in the x264vfw encoder. I hadn't noticed that this setting had desynchronized my already-working graphs too, and I was blaming the bridge filter.
In summary, I screwed up things by:
Launching the muxer graph from a thread other than the application thread (the thread processing the graph's event loop).
A latency switch in an upstream filter that was probably delaying things too much for the muxer to be able to keep up.
Author's comment:
// using this option, you can share a common clock
// and avoid any time mapping (essential if audio is in mux graph)
[id(13), helpstring("Live Timing option")]
HRESULT SetLiveTiming([in] BOOL bIsLiveTiming);
The method enables a special mode of operation which addresses live data. In this mode, sample times are converted between the graphs relative to the respective clock start times. Otherwise, the default mode is to expect time stamps to be reset to zero with graph changes.
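Illustratively, a rough sketch of the live-timing conversion described above (my approximation of the behaviour, not GMFBridge's actual code):
// Illustration only: an approximation of the "live timing" conversion described
// above, NOT GMFBridge's actual code. REFERENCE_TIME is in 100 ns units.
typedef long long REFERENCE_TIME;

// Both graphs share one clock; a sample time is relative to its own graph's start
// time, so rebasing it into the downstream (mux) graph is a simple shift:
REFERENCE_TIME RebaseLiveSampleTime(REFERENCE_TIME sampleTime,        // stamp in the capture graph
                                    REFERENCE_TIME captureGraphStart, // clock time the capture graph was run
                                    REFERENCE_TIME muxGraphStart)     // clock time the mux graph was run
{
    // If muxGraphStart ends up "in the future" (e.g. the graph was started from the
    // wrong thread, as in UPDATE 2), the result goes negative and samples get dropped.
    return sampleTime + (captureGraphStart - muxGraphStart);
}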

Best approach for transferring large data chunks over BLE

I'm new to BLE and hope you will be able to point me towards the right implementation approach.
I'm working on an application in which the peripheral (battery-operated) device continuously aggregates sensor readings.
On the mobile application side there will be a "sync" button; upon button press, I would like to transfer all the sensor readings that have accumulated in the peripheral to the mobile application.
The maximal duration between syncs can be several days; hence, the accumulated data can reach a size of 20 Kbytes.
Now, I'm wondering what will be the best approach to perform the data transfer from the peripheral to the central application.
I thought about creating an array of characteristics, where each characteristic will contain a fixed number of samples (e.g. representing 1 hour of readings).
Then, upon sync, I will:
Read the characteristics count (how many 1-hour cells).
Then read the characteristics (1-hour cells) one by one.
However, I have no idea if this is a valid approach.
I'm not sure if this is the most "power efficient" way that I can use.
I'm not sure if a characteristic READ is the way to go, or whether I need to use indications instead.
Any help here will be highly appreciated :)
Thanks in advance, Moti.
I would simply use notifications.
Use one characteristic which you write something to in order to trigger the transfer start.
Then have another characteristic which you simply stream data over by sending 20 bytes at a time. Most SDKs for BLE systems-on-a-chip have some way to control the flow of data so you don't send too fast, normally by having a callback triggered when the stack is ready to take the next notification.
In order to know the size of the data being sent, you can, for example, let the first notification contain the size, and the rest of them the data.
This is the most time- and power-efficient way, since many notifications can be sent per connection interval, compared to doing a lot of reads instead, which normally require two round trips each. Don't use indications, since they also require basically two round trips per indication. They're also quite useless anyway.
You could possibly also increase the speed by some percentage by exchanging a larger MTU (which lowers the L2CAP/ATT header overhead).
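A minimal sketch of that scheme on the peripheral side, not tied to any particular BLE stack; send_notification() stands in for whatever "notify this characteristic value" call your SDK provides, and all names are placeholders:
// Sketch only: first notification carries the total size, the rest carry data,
// paced by the stack's "ready for next notification" callback.
#include <cstdint>
#include <cstring>
#include <algorithm>

static const size_t kChunkSize = 20;            // fits the default ATT_MTU of 23

bool send_notification(const uint8_t *data, size_t len);   // SDK-specific (assumed)

static const uint8_t *g_data   = nullptr;       // sensor log being transferred
static size_t         g_total  = 0;
static size_t         g_offset = 0;

// Called when the central writes the "start transfer" control characteristic.
void start_transfer(const uint8_t *data, size_t total)
{
    g_data = data; g_total = total; g_offset = 0;

    // First notification carries only the total size (little-endian uint32).
    uint32_t size32 = static_cast<uint32_t>(total);
    uint8_t header[sizeof size32];
    std::memcpy(header, &size32, sizeof size32);
    send_notification(header, sizeof header);
}

// Called from the stack's "previous notification sent / buffer free" event.
void continue_transfer()
{
    if (g_offset >= g_total)
        return;                                  // transfer complete
    size_t len = std::min(kChunkSize, g_total - g_offset);
    if (send_notification(g_data + g_offset, len))
        g_offset += len;                         // advance only if the stack accepted it
}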

Oscilloscope type design with FPGA PL and PS framebuffer interface?

I am generating a certain signal (a digital pulse) in one of my Verilog modules running on the programmable logic of a Xilinx Zynq chip. The signal is pretty fast, with a clock of about 200 MHz.
I also have a simple Linux system and a framebuffer Qt interface running, for later controlling my application.
How can I sample my signal in order to make an oscilloscope-like interface inside my Qt app? I want to be able to provide a visual of the pulse I am generating.
What do I need to use to be able to sample enough data at such a clock frequency? And how do I pass it with a kernel module or mmap to Qt?
You would do best to do what most oscilloscopes do: sample the data to RAM, and only then transfer it to the processor for display/analysis, at a more "relaxed" pace.
On the FPGA side you will need a state machine that detects some sort of start or trigger condition, probably after a bit in a mode register is set from the software side to arm it.
The state machine will then fill samples into a buffer made of one or more block RAMs. If you want to place the trigger somewhere in the middle of the samples captured, you should treat the buffer as a circular one and have it record continuously, stopping a configurable number of samples after the trigger, so that some desired number of samples from before the trigger condition remain un-overwritten by newer ones following it.
Since FPGA block RAMs are typically dual-port, you can simply hook the other port up to your CPU bus for readout. You will probably want a register for reading the state of the sampling state machine and, if you go with the circular buffer approach, the address where it stopped, so that you can unwrap the data into a linear record of time.
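For the readout side, here is a rough sketch (not a drop-in driver) of mapping such a capture core from Linux via /dev/mem and unwrapping the circular buffer; the base address, register layout and depth are assumptions to be replaced with your own design's values:
// Assumed word layout behind an AXI slave: [0] status (bit0 = capture done),
// [1] stop index, [2..] the sample buffer. kCaptureBase and kDepth are placeholders.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static const off_t  kCaptureBase = 0x43C00000;  // AXI base address (assumed)
static const size_t kDepth       = 4096;        // samples in the circular buffer (assumed)

int main()
{
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    size_t mapLen = (2 + kDepth) * sizeof(uint32_t);
    void *p = mmap(nullptr, mapLen, PROT_READ, MAP_SHARED, fd, kCaptureBase);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    volatile uint32_t *regs = static_cast<volatile uint32_t *>(p);

    if (regs[0] & 1) {                          // capture finished?
        uint32_t stop = regs[1] % kDepth;       // where the state machine stopped
        std::vector<uint32_t> trace(kDepth);
        // Unwrap the circular buffer so trace[0] is the oldest sample.
        for (size_t i = 0; i < kDepth; ++i)
            trace[i] = regs[2 + (stop + i) % kDepth];
        std::printf("read %zu samples\n", trace.size());
        // ... hand 'trace' to the Qt plotting code ...
    }

    munmap(p, mapLen);
    close(fd);
    return 0;
}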
Trying to do streaming real-time sampling might be possible, but it would be a lot harder, and it is not clear that you could do anything meaningful with the data so produced in real time. Still, if you want to try, you would probably need to put a FIFO buffer between the sampling logic and the processor bus, as you will probably only be able to consume data in chunks while having to service other operations in between, so something is needed to absorb the constant-rate inflow of samples. Another approach could be to try to build a DMA engine which would write samples directly to external system RAM, but that will likely be even harder.
You could also see if there are any high-speed interfaces available in the CPU which you could leverage; they might be things originally intended for video, for example.
It also appears that you are measuring only a digital signal, i.e. probably one bit. If you want to handle a higher input sample rate than the FPGA fabric can support, you could potentially use something like a deserializer block at the edge of the FPGA to turn the 1-bit input stream into a slower stream of wider samples to store.
In terms of output, once you have a vector of samples in a buffer it's pretty simple to turn that into a scope/logic analyzer type plot, with as much zooming, cursor annotation, automatic measurement or whatever you like.
Also don't forget that if the intent is only to use this during development, FPGAs and their tools often have the ability to build a logic analyzer right into the design, with the data claimed over the programming interface for plotting on a PC.

Muxing non-synchronised streams to Haali

I have 2 input streams of data that are being passed to a Haali Muxer (mp4 format).
Currently I stream these to Haali directly in a DirectShow graph without a clock. I wondered if I should be trying to write these to the muxer synchronised, or whether it happily accepts a stream of audio data that stops before the video data stream stops. (I have issues with the output file not playing audio after seeking, and I'm not sure why this could occur)
I can't find much in the way of documentation for muxing with the Haali muxer; does anyone know the best place to look for info on this filter?
To have the streams multiplexed into a single MP4 file you need a single instance of the multiplexer (Haali, GDCL, a commercial one, a wrapper over the mp4v2 library, over a Media Foundation sink, etc.) with two (or more) input pins on it connected to the respective sources, which in turn are going to be written as tracks.
The filter graph clock does not matter. The clock is for presentation, and file writers accept incoming data and write it as soon as possible anyway. It is more accurate to remove the clock, as you seem to already be doing, but having the standard clock is not going to make a difference.
Data is synchronized using the time stamps on individual media samples, the parts of the media streams. The multiplexer builds internal queues for every stream and then consumes data from the streams to build a single file, in such a way that the original stream data is interleaved. If one stream supplies too much data, that is, if data is available too early while another stream supplies data slowly, the multiplexer blocks further data reception on this particular stream by not returning from the respective processing call (IPin::Receive), expecting that during this wait the slow stream provides additional input. Eventually, what the multiplexer looks at when matching data from different streams is the data time stamps.
To obtain synchronized data in the resulting MP4 file you thus need to make sure the payload data is properly time stamped. The multiplexer will take care of the rest.
This also means that the time stamps should be monotonically increasing within a stream, and that key frames/splice points are indicated accordingly. Otherwise some multiplexers might fail immediately, while others would produce the output file but it might have playback issues (esp. with seeking).
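For example, a minimal sketch of stamping a sample before delivering it to the muxer input, assuming a source or encoder filter built on the DirectShow base classes (FillAndStampSample, its parameters and the duration value are illustrative):
// Sketch only: set monotonically increasing time stamps and mark key frames.
#include <streams.h>

HRESULT FillAndStampSample(IMediaSample *pSample,
                           REFERENCE_TIME rtStart,     // running stream time of this sample
                           REFERENCE_TIME rtDuration,  // e.g. 333333 (~30 fps) in 100 ns units
                           BOOL bKeyFrame)
{
    // ... copy the encoded payload into pSample's buffer here ...

    REFERENCE_TIME rtStop = rtStart + rtDuration;
    HRESULT hr = pSample->SetTime(&rtStart, &rtStop);  // monotonically increasing per stream
    if (FAILED(hr))
        return hr;

    pSample->SetSyncPoint(bKeyFrame);                  // mark key frames/splice points
    return S_OK;
}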

SampleGrabber callback is getting called spuriously even when graph is paused

I'm using the DirectShow SampleGrabber in callback mode to capture video frames from a source file and do some processing. I would also like to maintain the current playback rate of the video, and I need to support random, forward and backward seeking. For this I'm also doing some local buffering in a different thread.
I'm running the graph with the sync source set to NULL, so as to get maximum speed. However, when I pause the graph after a fixed amount of buffering, the SampleGrabber callback still gets called spuriously even though the graph is paused. This is affecting my frame indexing and tracking. I want to resume the graph from exactly the same position at which it was paused. If I run the graph with the default clock it works fine, but then my playback is affected, and I want the buffering thread to finish as soon as possible.
How can I make sure that the callback is not called when the graph is paused? Any thoughts or suggestions would be of great help.
Thanks in advance
Pradeep
A paused graph typically has all the same streaming going on internally (the active state), with the exception that the renderers block streaming, especially as soon as enough data has been received for a preview banner. Since you removed the clock from the graph, your renderer is likely not to block execution, because it does not hold any clock to pause against. In your case the problem comes from your intent to reuse the same graph for both quick parsing through the file and playback. A separate-graph design looks like it has a better chance of doing this well.
