How to process RTP data for the Microsoft DirectShow MPEG-1 decoder

Starting from the videoprocessing project, I'm trying to build a DirectShow filter that connects to an RTSP server and acts as a source filter for the Windows MPEG-1 decoder (I cannot use other formats or decoders, since my OS target is WinCE).
My filter declares the media type as:
major type: MEDIATYPE_Video
subtype: MEDIASUBTYPE_MPEG1Payload
format type: FORMAT_MPEGVideo
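In code, the output pin's media type is set up roughly like this (a simplified sketch; copying the received sequence header into MPEG1VIDEOINFO::bSequenceHeader and the picture dimensions are assumptions on my part, not verified requirements):

#include <streams.h>   // DirectShow base classes (CMediaType, MPEG1VIDEOINFO, GUIDs)

HRESULT DeclareMediaType(CMediaType *pmt, const BYTE *seqHdr, DWORD seqHdrLen)
{
    pmt->SetType(&MEDIATYPE_Video);
    pmt->SetSubtype(&MEDIASUBTYPE_MPEG1Payload);
    pmt->SetFormatType(&FORMAT_MPEGVideo);
    pmt->SetTemporalCompression(TRUE);

    // MPEG1VIDEOINFO already reserves one byte of bSequenceHeader.
    DWORD cb = sizeof(MPEG1VIDEOINFO) + seqHdrLen;
    MPEG1VIDEOINFO *pvi = (MPEG1VIDEOINFO *)pmt->AllocFormatBuffer(cb);
    if (!pvi)
        return E_OUTOFMEMORY;
    ZeroMemory(pvi, cb);
    pvi->hdr.bmiHeader.biSize   = sizeof(BITMAPINFOHEADER);
    pvi->hdr.bmiHeader.biWidth  = 352;   // placeholder; in practice parsed from the sequence header
    pvi->hdr.bmiHeader.biHeight = 288;   // placeholder
    pvi->cbSequenceHeader = seqHdrLen;
    CopyMemory(pvi->bSequenceHeader, seqHdr, seqHdrLen);
    return S_OK;
}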
Currently, when I connect my rtspSource filter to the CLSID_CMpegVideoCodec decoder, the rendered video is black.
However, if I replace the Windows decoder with the CLSID_LAV_VideoDecoderFilter provided by the LAVFilters project, the video is rendered correctly.
After reading "How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter", which deals with the same issue for H.264 and MPEG-4, I also read RFC 2250 and depacketized the data accordingly, but the result is the same.
Currently I am sending the decoder packets that start with the Picture start code
000001 00 (Picture)
or whole packets that start with
000001 B3 (Sequence Header)
and which also contain the start codes
000001 B2 (User Data)
000001 B8 (Group of Pictures)
000001 00 (Picture)
000001 01 (Slice)
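For completeness, this is a minimal sketch of the kind of reassembly I am doing (simplified; it assumes RFC 2250, section 3.4 video payloads, where each packet carries a 4-byte MPEG video-specific header and the last fragment of a picture has the RTP marker bit set):

#include <vector>
#include <cstdint>

struct RtpPacket {
    uint32_t timestamp;       // 90 kHz RTP timestamp
    bool     marker;          // set on the last fragment of a picture
    const uint8_t *payload;   // payload after the 12-byte RTP header
    size_t   payloadSize;
};

class Mpeg1Depacketizer {
    std::vector<uint8_t> m_picture;
public:
    // Returns true when a complete picture (starting with 00 00 01 xx) is ready.
    bool Push(const RtpPacket &pkt, std::vector<uint8_t> &outPicture)
    {
        if (pkt.payloadSize <= 4)
            return false;
        // Skip the 4-byte MPEG video-specific header (RFC 2250, section 3.4).
        m_picture.insert(m_picture.end(),
                         pkt.payload + 4,
                         pkt.payload + pkt.payloadSize);
        if (!pkt.marker)
            return false;     // picture not complete yet
        outPicture.swap(m_picture);
        m_picture.clear();
        return true;          // one IMediaSample is delivered per complete picture
    }
};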
The previous link, which covers the H.264 and MPEG-4 cases, talks about "processing data for the decoder", but it is not clear to me exactly what the CLSID_CMpegVideoCodec filter expects once it has agreed on the MEDIASUBTYPE_MPEG1Payload subtype.
However, if I prepend the three bytes 000001 (or the four bytes 00000100) to each sample, the video is rendered, but the image only updates roughly every 2 seconds and the intermediate frames are lost.
I performed the tests both by setting the IMediaSample times with
SetTime(NULL, NULL)
and with
SetTime(start, start+1)
where:
start = (rtp_timestamp - rtp_timestamp_first_packet) + 300ms
following the answer to "Writing custom DirectShow RTSP/RTP Source push filter - timestamping data coming from live sources",
but the results do not change.
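In code, the timestamping looks roughly like this (a simplified sketch; the conversion from the 90 kHz RTP clock to 100 ns REFERENCE_TIME units and the 300 ms offset are my assumptions about what a live source needs):

#include <dshow.h>

REFERENCE_TIME ToRefTime(DWORD rtpTs, DWORD firstRtpTs)
{
    // RTP video clock runs at 90 kHz; REFERENCE_TIME is in 100 ns units.
    return (REFERENCE_TIME)(rtpTs - firstRtpTs) * 10000000 / 90000;
}

HRESULT StampSample(IMediaSample *pSample, DWORD rtpTs, DWORD firstRtpTs, BOOL isKeyFrame)
{
    const REFERENCE_TIME latency = 3000000;   // ~300 ms offset for a live source
    REFERENCE_TIME tStart = ToRefTime(rtpTs, firstRtpTs) + latency;
    REFERENCE_TIME tStop  = tStart + 1;       // "start + 1" as described above
    pSample->SetTime(&tStart, &tStop);
    pSample->SetSyncPoint(isKeyFrame);        // e.g. samples starting with a sequence header / I-picture
    return S_OK;
}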
Any suggestions would be greatly appreciated.
Thanks in advance.

Related

GMFBridge DirectShow filter SetLiveTiming effect

I am using the excellent GMFBridge DirectShow family of filters to great effect, allowing me to close a video recording graph and open a new one with no data loss.
My original source graph was capturing live video from standard video and audio inputs.
There is an undocumented method on the GMFBridgeController filter named SetLiveTiming(). From the name, I figured it should be set to true when capturing from a live graph (not from a file), as in my case. I set this value to true and everything worked as expected.
The same capture hardware allows me to capture live TV signals (ATSC in my case), so I created a new version of the graph using the BDA architecture filters, for tuning purposes. Once the data flows out from the MPEG demuxer, the rest of the graph is virtually the same as my original graph.
However, on this occasion my muxing graph (on the other side of the bridge) was not working. Data flowed from the BridgeSource filter (video and audio) and reached an MP4 muxer filter, but no data was flowing from the muxer output into the FileWriter filter.
After several hours I traced the problem to the SetLiveTiming() setting. I turned it off, everything began working as expected, and the muxer filter began producing an output file. However, the audio was not synchronized to the video.
Can someone enlighten me on the real purpose of the SetLiveTiming() setting and perhaps, why one graph works with the setting enabled, while the other fails?
UPDATE
I managed to compile the GMFBridge project, and it seems that the filter is dropping every received sample because of a negative timestamp computation. However, I am completely baffled by the results I am seeing after enabling the filter log.
UPDATE 2: The dropped samples were introduced by the way I launched the secondary (muxer) graph. I inspected a sample using a SampleGrabber (thus inside a streaming thread) as a trigger point and used a Task.Run() .NET call to instantiate the muxer graph. This somehow messed up the clocks and left me with a 'reference start point' in the future; when the bridge attempted to fix the timestamp by subtracting the reference start point, it produced a negative timestamp. Once I corrected this and spawned the graph from the application thread (by posting a graph event), the problem was fixed.
Unfortunately, my multiplexed video (regardless of the SetLiveTiming() setting) is still out of sync.
I read that the GMFBridge filter can have trouble when the InfTee filter is being used; however, I think that my graph shouldn't have this problem, as no instance of the InfTee filter is directly connected to the bridge sink.
Here is my current source graph:
[NetworkProvider]-->[DigitalTuner]-->[DigitalCapture]-->[demux]
[demux]-->[TIF]
[demux]-->[Mpeg Tables]
[demux]-->[lavAudioDec]-->[tee]-->[audioConvert]-->[sampleGrabber]-->[NULL]
                          [tee]-->[aacEncoder]-->[*Bridge Sink*]
[demux]-->[VideoDecoder]-->[sampleGrabber]-->[x264Enc]-->[*Bridge Sink*]
Here is my muxer graph:
... [bridge source]--(video)-->[MP4 muxer]--->[fileWriter]
... [bridge source]--(audio)-->[MP4 muxer]
All the sample grabbers in the graph are read-only. If I mux the output file without bridging (by placing the muxer on the capture graph), the output file stays in sync (this turned out not to be true; the out-of-sync problem was introduced by a latency setting in the H264 encoder), but then I can't avoid losing some seconds between releasing the current capture graph and running the new one (with the updated file name).
UPDATE 3:
The out-of-sync problem was inadvertently introduced by me several days ago, when I switched off a "Zero-latency" setting in the x264vfw encoder. I hadn't noticed that this setting had desynchronized my already-working graphs too, and I was blaming the bridge filter.
In summary, I screwed things up by:
- Launching the muxer graph from a thread other than the application thread (the thread processing the graph's event loop).
- A latency switch in an upstream filter that was probably delaying things too much for the muxer to be able to keep up.
Author's comment:
// using this option, you can share a common clock
// and avoid any time mapping (essential if audio is in mux graph)
[id(13), helpstring("Live Timing option")]
HRESULT SetLiveTiming([in] BOOL bIsLiveTiming);
The method enables a special mode of operation that addresses live data. In this mode, sample times are converted between the graphs relative to the respective clock start times. Otherwise, the default mode is to expect time stamps to reset to zero when graphs change.
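For illustration, a minimal usage sketch (the interface and method names come from the GMFBridge IDL quoted above; the header name and the rest of the bridge setup are assumed or omitted):

#include <windows.h>
// #include "GMFBridge_i.h"   // generated from the GMFBridge IDL

HRESULT ConfigureBridgeForLiveCapture(IGMFBridgeController *pBridge)
{
    // Per the author's comment: share a common clock and avoid any time
    // mapping (essential if audio is in the mux graph).
    return pBridge->SetLiveTiming(TRUE);
}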

GnuRadio tcp_sink data values are garbled

I'm developing a web front end for a GNU Radio application developed by a colleague.
I have a TCP client connecting to the output of two TCP Sink blocks, and the data encoding is not as I expect it to be.
One TCP Sink is sending complex data and the other is sending float data.
I'm decoding the data at the client by reading each 4-byte chunk as a float32 value. The server and the client are both little-endian systems, but I also tried byte swapping (with the GNU Radio Endian Swap block and also manually at the client), and the data is still not right. In fact it is much worse that way, which confirms there is no byte-order mismatch.
When I execute the flow graph in GNU Radio Companion with the appropriate GUI elements, the plots look correct: the data values are, as expected, between 0 and 10.
However, the values decoded at the client are generally around 0.00xxxxx, and the plot looks like noise rather than the simple tone seen in GNU Radio. If I manually scale the data by multiplying by 1000, it still looks like noise.
I'll describe the pre-D path in GNU Radio since it's shorter, but I see the same problem on the post-D path, where a WBFM Receive and a Rational Resampler are added, followed by a Throttle block and then a TCP Sink block sending float data.
File Source (Output Type: complex, vector length: 1) =>
Throttle (vector length: 1) =>
Low Pass Filter (FIR Type: Complex->Complex (Decimating)) =>
Throttle (vector length: 1) =>
TCP Sink (input type: complex, vector length: 1).
This seems to be the correct way to specify the stream parameters (and indeed Companion shows errors if I make changes which mismatch the stream items), but I can find no way to decode the data correctly on the other end of the stream.
"the historic RFC 1700 (also known as Internet standard STD 2) has defined the network order for protocols in the Internet protocol suite to be big-endian , hence the use of the term 'network byte order' for big-endian byte order."
see https://en.wikipedia.org/wiki/Endianness
Although it mentions that the network order for protocols is big-endian, this says nothing about the byte order of the network payload itself.
Also note: Sun Microsystems made computers with big-endian native byte order (and much Internet protocol development was done on them).
I am surprised the previous answer has gone this long without a lesson on network byte order versus native byte order.
GNURadio appears to assume native byte order from a UDP Source block.
Examining the datatype color codes in Help->Types of GNURadio Companion, the orange colored 'float' connections are float32.
To verify a computer's native byte order, in Python, do:
from sys import byteorder
byteorder
the result will be 'little' or 'big'
It might be that no matter what type of floats you are sending, the bytes end up on the network in little-endian order. I had a similar problem with a UDP connection, and I solved it by parsing the floats as little-endian on the client side.
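For example, a minimal parsing sketch (my assumption of the stream layout: GNU Radio's complex items are two consecutive float32 values, I then Q, written in the sender's native byte order, here little-endian):

#include <cstdint>
#include <cstring>
#include <vector>
#include <complex>

std::vector<std::complex<float>> ParseComplexStream(const uint8_t *buf, size_t size)
{
    std::vector<std::complex<float>> samples;
    // 8 bytes per complex item: float32 I followed by float32 Q.
    for (size_t off = 0; off + 8 <= size; off += 8) {
        float i, q;
        std::memcpy(&i, buf + off, 4);      // assumes the client is also little-endian
        std::memcpy(&q, buf + off + 4, 4);
        samples.emplace_back(i, q);
    }
    return samples;
}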

v4l2 -> QByteArray(?) -> QWebsocket -> internet -> {PC, Android, web}

As you can guess from the title, I would like to broadcast a webcam stream to different clients. I know that there are many solutions (such as motion), but I already have a working infrastructure based on a Qt server application, with a websocket as the connection to the outside world.
I have read the source code of other Linux applications like kopete and motion to find the most efficient way, but haven't reached a good conclusion. Another goal is to keep the websocket stream in a format that can be decoded by e.g. JavaScript in a browser.
The source, a v4l2 device, is already accessed. There are different formats (YUV, MJPEG, ...), but I don't know which (standard) format to choose when it comes to streaming. Another requirement is to save the stream to a hard drive and to process the stream (OpenCV?) to detect motion. So the question is: should I transmit a zlib-compressed QByteArray, or use MJPEG, which I don't know how to use? The webcam is a uvcvideo device:
ioctl: VIDIOC_ENUM_FMT
Index       : 0
Type        : Video Capture
Pixel Format: 'MJPG' (compressed)
Name        : MJPEG
Index       : 1
Type        : Video Capture
Pixel Format: 'YUYV'
Name        : YUV 4:2:2 (YUYV)
To be honest, I am not sure how motion does this in detail, but that might be the way to go.
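What I have in mind is roughly this (just a sketch of the direction I'm considering; sendJpegFrame is a hypothetical helper, and I haven't verified that this is the right approach):

#include <QWebSocket>
#include <QByteArray>

// Each V4L2 MJPEG buffer is already a complete JPEG image, so it could be
// forwarded as one binary websocket message without re-encoding or zlib;
// a browser client could turn each message into a Blob and show it in an <img>.
void sendJpegFrame(QWebSocket &socket, const uchar *frame, int length)
{
    QByteArray payload(reinterpret_cast<const char *>(frame), length);
    socket.sendBinaryMessage(payload);
}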
Thanks
small

GDCL Mpeg-4 Multiplexor Problem

I just created a simple graph:
SourceFilter(*.mp4 file format) ---> GDCL MPEG 4 Mux Filter ---> File writer Filter
It works fine. But when the source is an h264 file:
SourceFilter( *.h264 file format) ---> GDCL MPEG 4 Mux Filter---> File writer Filter
It records a file, but the recorded file does not play in VLC Player, QuickTime, BS Player, or WM Player.
What am I doing wrong? Any ideas on how to record an h264 video source? Do I need an H264 mux?
Best Wishes
PS: I JUST want to record video, by the way... why do I need a mux?
There are two H.264 formats used by DirectShow filters. One is Byte Stream Format, in which each NALU is preceded by a start code 00 00 01. The other is the format used within MP4 files, in which each start code is preceded by a length (the media type or the MP4 file metadata specifies how many bytes are used in the length field). The problem is that some FOURCCs are used for both formats.
The MP4 mux sample accepts either BSF or length-preceded data, depending on the subtype given. It does not attempt to work out which it is. Most likely, when you are feeding the H.264 elementary stream, you are giving the mux a FOURCC or media type that the mux thinks means length-prepended, when you are actually giving it BSF data. Check in TypeHandler::CanSupport.
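To illustrate the difference (just a sketch of mine, not code from the mux sample): going from BSF to the length-prepended form used inside MP4 means replacing each start code with a length field, for example a 4-byte big-endian length:

#include <vector>
#include <cstdint>

static size_t FindStartCode(const uint8_t *p, size_t size, size_t from, size_t *scLen)
{
    for (size_t i = from; i + 3 <= size; ++i) {
        if (p[i] == 0 && p[i + 1] == 0) {
            if (p[i + 2] == 1) { *scLen = 3; return i; }
            if (i + 4 <= size && p[i + 2] == 0 && p[i + 3] == 1) { *scLen = 4; return i; }
        }
    }
    return size;   // no further start code
}

std::vector<uint8_t> AnnexBToLengthPrefixed(const uint8_t *p, size_t size)
{
    std::vector<uint8_t> out;
    size_t scLen = 0;
    size_t pos = FindStartCode(p, size, 0, &scLen);
    while (pos < size) {
        size_t nalStart = pos + scLen;
        size_t nextScLen = 0;
        size_t next = FindStartCode(p, size, nalStart, &nextScLen);
        uint32_t nalSize = (uint32_t)(next - nalStart);
        // The 4-byte big-endian length replaces the start code.
        out.push_back((uint8_t)(nalSize >> 24));
        out.push_back((uint8_t)(nalSize >> 16));
        out.push_back((uint8_t)(nalSize >> 8));
        out.push_back((uint8_t)nalSize);
        out.insert(out.end(), p + nalStart, p + next);
        pos = next;
        scLen = nextScLen;
    }
    return out;
}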
If you just want to save H.264 video to a file, you can use a Dump filter to just write the bits to a file. If you are saving BSF, this is a valid H.264 elementary stream file. If you want support for the majority of players, or if you want seeking support, then you will want to write the elementary stream into a container with an index, such as MP4. In this case, you need a mux, not for the multiplexing, but for the indexing and metadata creation.
G

Extracting SMPTE timecode from audio stream

I'm working on an audio recording system. My task involves extracting the SMPTE time code from the audio input stream, which is generated by a synchronizer device. I'm using the ASIO SDK to get the time code of each callback buffer, but it's always zero.
Perhaps somebody who has experience with the ASIO SDK (or any other platform/SDK that can be used to extract SMPTE timecode from an audio stream) could help me?
Regards,
Ben
LTC is straightforward, so if nothing else, you could just scan the audio stream for LTC data, as documented on Wikipedia. Each 80-bit frame ends with the sync word 0011 1111 1111 1101; just scan for that bit sequence to synchronize, then cast the buffer data starting after the sync sequence to an array of 80-bit struct timecode_t elements. If your buffer is sized as a multiple of 80, your calculations will be easier (but you do need to test for loss of sync, because soundcards lose bits on overruns).
The hard part is that, if I am not mistaken, the time code "bits" are not the same as the bits of the sampled audio stream, so you would have to implement logic to detect the bit sequence. This can just be a for loop checking for the proper signal changes and appending bits to a buffer as appropriate (and then calling the function that interprets the buffer when it is full).
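As a rough illustration (my own sketch, not tested: it assumes the biphase-mark bits have already been recovered into an array of at least 80 bits, with the least significant bit of each field first):

#include <cstdint>

struct LtcTime { int hours, minutes, seconds, frames; };

// bits[] holds 80 recovered LTC bits in transmission order.
// Returns true if the frame ends with the sync word 0011 1111 1111 1101
// and fills 'out' with the decoded BCD time fields.
bool DecodeLtcFrame(const uint8_t *bits, LtcTime *out)
{
    static const uint8_t sync[16] = {0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1};
    for (int i = 0; i < 16; ++i)
        if (bits[64 + i] != sync[i])
            return false;

    auto bcd = [&](int start, int count) {
        int v = 0;
        for (int i = 0; i < count; ++i)
            v |= bits[start + i] << i;    // fields are sent least significant bit first
        return v;
    };
    out->frames  = bcd(0, 4)  + 10 * bcd(8, 2);
    out->seconds = bcd(16, 4) + 10 * bcd(24, 3);
    out->minutes = bcd(32, 4) + 10 * bcd(40, 3);
    out->hours   = bcd(48, 4) + 10 * bcd(56, 2);
    return true;
}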
