Video Processor MFT and deinterlacing - ms-media-foundation

The MSDN documentation for the Video Processor MFT mentions that the MFT can be used to deinterlace interlaced video.
I set the output media type to the same as the input, plus MF_MT_INTERLACE_MODE set to progressive on the output media type.
But the output samples are still interlaced.

I can't test the Video Processor MFT because it needs Windows 8/10, but I will say two things:
The documentation says it is GPU accelerated, but does not say whether it falls back to software processing. So if it is only GPU accelerated and your GPU does not support deinterlacing, that could explain why your frames are still interlaced. You can check DXVAHD_PROCESSOR_CAPS.
For correct deinterlacing, the input samples need to have some of these attributes set: MFSampleExtension_Interlaced, MFSampleExtension_BottomFieldFirst, MFSampleExtension_RepeatFirstField, and so on (see Sample Attributes). So check whether the parser/decoder sets those values correctly. If it does not, the Video Processor MFT will not be able to deinterlace.
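As an illustration only (not from the original answer), here is a minimal C++ sketch of that setup and check. It assumes pProcessor, pInputType and pSample already exist, and omits most error handling; the function names are made up for the example.

```cpp
// Sketch: ask the Video Processor MFT for progressive output, and verify that
// the upstream parser/decoder actually stamps interlacing info on its samples.
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>

HRESULT ConfigureProgressiveOutput(IMFTransform *pProcessor, IMFMediaType *pInputType)
{
    IMFMediaType *pOutputType = nullptr;
    HRESULT hr = MFCreateMediaType(&pOutputType);
    if (SUCCEEDED(hr))
        hr = pInputType->CopyAllItems(pOutputType);                 // same as the input...
    if (SUCCEEDED(hr))
        hr = pOutputType->SetUINT32(MF_MT_INTERLACE_MODE,
                                    MFVideoInterlace_Progressive);  // ...but progressive
    if (SUCCEEDED(hr))
        hr = pProcessor->SetOutputType(0, pOutputType, 0);
    if (pOutputType)
        pOutputType->Release();
    return hr;
}

void DumpInterlaceAttributes(IMFSample *pSample)
{
    UINT32 interlaced = 0, bff = 0;
    // If the decoder/parser never sets these, the Video Processor MFT has
    // nothing to base a deinterlacing decision on.
    if (FAILED(pSample->GetUINT32(MFSampleExtension_Interlaced, &interlaced)))
        OutputDebugStringW(L"MFSampleExtension_Interlaced not set\n");
    if (FAILED(pSample->GetUINT32(MFSampleExtension_BottomFieldFirst, &bff)))
        OutputDebugStringW(L"MFSampleExtension_BottomFieldFirst not set\n");
}
```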

Related

AMDh264Encoder returning MF_E_ATTRIBUTENOTFOUND when checking MFSampleExtension_CleanPoint

When receiving an output from IMFTransform::ProcessOutput, calling GetUINT32(MFSampleExtension_CleanPoint) on the sample fails with MF_E_ATTRIBUTENOTFOUND, but only while using the AMDh264Encoder (NV12 in, H.264 out); as a result no keyframes are marked in the final output video and it is corrupted.
What causes GetUINT32(MFSampleExtension_CleanPoint) to fail with MF_E_ATTRIBUTENOTFOUND only on the AMDh264Encoder?
Video encoder MFTs are supplied by hardware vendors. AMD provides "AMDh264Encoder" for its hardware and ships it with its video drivers.
For this reason implementations from different vendors have slight differences, and AMD decided not to set this attribute on the produced media samples.
You should handle this gracefully and treat the attribute as optional.
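As a hedged illustration (not part of the original answer), the tolerant handling might look like this, where pOutSample is the sample obtained from IMFTransform::ProcessOutput:

```cpp
// Sketch: treat MFSampleExtension_CleanPoint as optional metadata, since some
// vendor encoders (reportedly the AMD H.264 MFT here) do not set it.
UINT32 cleanPoint = 0;
HRESULT hr = pOutSample->GetUINT32(MFSampleExtension_CleanPoint, &cleanPoint);
if (hr == MF_E_ATTRIBUTENOTFOUND)
{
    // Missing attribute is not fatal. If you really need keyframe positions,
    // fall back to parsing the H.264 bitstream (NAL unit type 5 = IDR).
    cleanPoint = 0;
    hr = S_OK;
}
```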

Why does GetDeliveryBuffer block with an INTERLEAVE_CAPTURE mode AVI Mux?

I'm trying to use a customized filter to receive video and audio data from a RTSP stream, and deliver samples downstream the graph.
It seems that this filter was modified from the SDK source.cpp sample (CSource), and it implements two output pins for audio and video.
When the filter is directly connected to an avi mux filter with INTERLEAVE_NONE mode, it works fine.
However, when the interleave mode of the avi mux is set to INTERLEAVE_CAPTURE,
the video output pin hangs on the GetDeliveryBuffer method (in DoBufferProcessingLoop) of this filter after several samples have been sent,
while the audio output pin still works well.
Moreover, when I inserted an infinite pin tee filter into one of the paths between the avi mux and this source filter,
the graph arbitrarily turned into the stopped state after some samples had been sent (one to three samples or so).
And when I put a filter that is just an empty trans-in-place filter, which does nothing, after the infinite tee,
the graph went back to the first case: it never turns to the stopped state, but hangs on GetDeliveryBuffer.
(Here is an image that shows the connections I've mentioned.)
So here are my questions:
1: What could be the reasons that the video output pin hangs on GetDeliveryBuffer?
My guess is that the avi mux holds on to these sample buffers and does not release them until it has enough of them for interleaving,
but even when I set the number of video buffers to 30 in DecideBufferSize it still hangs. If that is indeed the reason, how do I decide the buffer size of the pin for a downstream avi muxer?
Creating more than 50 buffers on a video pin is likely not guaranteed to work, because the required memory cannot be promised. :(
2: Why does the graph go to the stopped state when the infinite pin tee is inserted? And why can a no-operation filter overcome it?
Any answer or suggestion is appreciated, or I hope someone can just give me some directions. Thanks.
A blocked GetDeliveryBuffer means the allocator you are requesting a buffer from does not [yet] have anything for you. All media samples are outstanding and have not yet been returned to the allocator.
An obvious workaround is to request more buffers at the pin connection and memory allocator negotiation stage. This, however, just postpones the issue, which can very much appear again later for the same reason.
A typical issue with the topology in question is related to threading. A multiplexer filter which has two inputs has to match the input streams to produce a joint file. Quite often at runtime it will be holding media samples on one leg while expecting more media samples to come in on the other leg, on another thread. It assumes that the upstream branches providing media samples run independently, so that a lock on one leg does not lock the other. This is why the multiplexer can freely both block in IMemInputPin::Receive and/or hold media samples inside. In the topology above it is not clear how exactly the source filter handles threading. The fact that it has two pins makes me assume it might have threading issues and might not take into account that there can be a lock downstream on the multiplexer.
Presumably the source filter is yours and you have its source code. You want to make sure the audio pin is sending media samples on a separate thread, for example through an asynchronous queue.
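As a rough, hedged sketch (not from the original answer) of what "delivering on a separate thread" can look like: the source thread only pushes samples into a queue, and a dedicated worker thread performs the potentially blocking delivery to the multiplexer. Sample, DeliverDownstream and AudioDeliveryQueue are made-up names for this example; the DirectShow base classes also provide COutputQueue for a similar purpose.

```cpp
// Hedged sketch: decouple audio delivery from the thread that feeds the video
// pin, so a multiplexer blocking on one leg cannot stall the other leg.
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

struct Sample;                         // placeholder for whatever wraps IMediaSample

// Hypothetical: performs the actual (possibly blocking) downstream delivery,
// i.e. roughly what DoBufferProcessingLoop does per sample.
bool DeliverDownstream(Sample *s) { /* pPin->Deliver(...) would go here */ return s != nullptr; }

class AudioDeliveryQueue {
public:
    AudioDeliveryQueue() : worker_(&AudioDeliveryQueue::Run, this) {}
    ~AudioDeliveryQueue() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();
    }
    // Called by the source/filling thread; returns quickly instead of blocking.
    void Push(Sample *s) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(s); }
        cv_.notify_one();
    }
private:
    void Run() {                       // delivery happens on this dedicated thread
        for (;;) {
            Sample *s = nullptr;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                if (stop_ && q_.empty()) return;
                s = q_.front(); q_.pop();
            }
            DeliverDownstream(s);      // may block on the mux without stalling video
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Sample *> q_;
    bool stop_ = false;
    std::thread worker_;
};
```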

Streaming Video On Lossy Network

Currently I have a GStreamer stream being sent over a wireless network. I have a hardware encoder that converts raw, uncompressed video into an MPEG-2 transport stream with H.264 encoding. From there, I pass the data to a GStreamer pipeline that sends the stream out over RTP. Everything works and I'm seeing video; however, I was wondering whether there is a way to limit the effects of packet loss by tuning certain parameters on the encoder.
The two main parameters I'm looking at are the GOP Size and the I frame rate. Both are summarized in the documentation for the encoder (a Sensoray 2253) as follows:
V4L2_CID_MPEG_VIDEO_GOP_SIZE:
Integer range 0 to 30. The default setting of 0 means to use the codec default
GOP size. Capture only.
V4L2_CID_MPEG_VIDEO_H264_I_PERIOD:
Integer range 0 to 100. Only for H.264 encoding. Default setting of 0 will
encode first frame as IDR only, otherwise encode IDR at first frame of
every Nth GOP.
Basically, I'm trying to give the decoder as good a chance as possible to produce smooth video playback, even given the fact that the network may drop packets. Will increasing the I-frame rate do this? Namely, since an I-frame doesn't depend on data from previous or future frames, will sending the "full" image more often help? What would be the "ideal" setting for the two parameters above, given that the data is being sent across a lossy network? Note that I can accept a slight (~10%) increase in bandwidth if it means the video is smoother than it is now.
I also understand that this is highly decoder dependent, so for the sake of argument let's say that my main decoder on the client side is VLC.
Thanks in advance for all the help.
Increasing the number of I-frames will help the decoder recover more quickly. You may also want to look at limiting the bandwidth of the stream, since that makes it more likely the data gets through. You'll need to watch the data size, though, because your video quality can suffer greatly: I-frames are considerably larger than P or B frames, and the encoder will continue to target the specified bitrate.
If you had some control over playback (even locally capturing the stream and retransmitting to VLC), you could add FEC, which would correct lost packets.
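Not part of the original answer, but assuming the Sensoray driver exposes the two controls quoted above through the standard V4L2 control interface, tightening the GOP could look like the following sketch. The device node and the values 15 and 1 are assumptions for illustration only.

```cpp
// Hedged sketch: request a shorter GOP and an IDR frame at the start of every
// GOP via VIDIOC_S_CTRL, trading some bitrate for faster recovery after loss.
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>
#include <cstdio>

static int set_ctrl(int fd, unsigned id, int value)
{
    v4l2_control ctrl = {};
    ctrl.id = id;
    ctrl.value = value;
    return ioctl(fd, VIDIOC_S_CTRL, &ctrl);      // -1 on failure, errno set
}

int main()
{
    int fd = open("/dev/video0", O_RDWR);        // device node is an assumption
    if (fd < 0) { perror("open"); return 1; }

    if (set_ctrl(fd, V4L2_CID_MPEG_VIDEO_GOP_SIZE, 15) < 0)       // 15-frame GOP
        perror("V4L2_CID_MPEG_VIDEO_GOP_SIZE");
    if (set_ctrl(fd, V4L2_CID_MPEG_VIDEO_H264_I_PERIOD, 1) < 0)   // IDR every GOP
        perror("V4L2_CID_MPEG_VIDEO_H264_I_PERIOD");

    close(fd);
    return 0;
}
```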

How is the bandwidth attribute in the master m3u8 playlist determined?

I have read from many sources that BANDWIDTH is a required attribute, supposedly to be an upper bound of the actual bitrate of the video, while also allowing for "container overhead."
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,RESOLUTION=480x270,CODECS="avc1.42001e,mp4a.40.2",BANDWIDTH=663000
test110_600_.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,RESOLUTION=640x360,CODECS="avc1.4d001f,mp4a.40.2",BANDWIDTH=1088000
test110_1m.m3u8
How is this BANDWIDTH=663000 and BANDWIDTH=1088000 determined? Or rather, how should it be determined? Test runs with the Amazon Elastic Transcoder give seemingly wild results, especially when using videos of short duration; with Amazon's services I have created playlists where the bitrate of the video rose above the BANDWIDTH specified in the m3u8 file.
The BANDWIDTH is the overall bitrate of the movie (including transmission overhead).
Usually the bitrate is determined at the encoding/transcoding step. The inconsistent resulting bitrate with the Amazon transcoder might be caused by incorrect options. If you need a constant bitrate (more exactly, a hard maximum bitrate), you must not use constant-quality mode (the usual default mode).
For calculating the bandwidth of an already encoded movie file, there are various analysis tools. You can find more information by Googling 'bitrate calculator'.
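Not from the original answer, but to make the calculation concrete: the HLS spec intends BANDWIDTH to be at least the peak segment bitrate of the variant, so one way to derive it offline is to take the largest value of (segment size in bits / segment duration) across all segments and add some headroom. The structure and numbers below are illustrative assumptions.

```cpp
// Hedged sketch: estimate a BANDWIDTH value for one variant playlist from the
// sizes and #EXTINF durations of its media segments.
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Segment {
    std::uint64_t size_bytes;   // size of the .ts segment
    double        duration_s;   // duration from #EXTINF
};

std::uint64_t EstimateBandwidth(const std::vector<Segment>& segments, double headroom = 1.1)
{
    double peak = 0.0;
    for (const Segment& s : segments)
        peak = std::max(peak, (s.size_bytes * 8.0) / s.duration_s);
    return static_cast<std::uint64_t>(peak * headroom);   // ~10% margin for overhead
}

int main()
{
    std::vector<Segment> segs = { {750000, 10.0}, {820000, 10.0}, {700000, 9.8} };
    std::printf("BANDWIDTH=%llu\n",
                static_cast<unsigned long long>(EstimateBandwidth(segs)));
    return 0;
}
```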

High Resolution Capture and Encoding

I'm using two custom push filters to inject audio and video (uncompressed RGB) into a DirectShow graph. I'm making a video capture application, so I'd like to encode the frames as they come in and store them in a file.
Up until now, I've used the ASF Writer to encode the input to a WMV file, but it appears the renderer is too slow to process high resolution input (such as 1920x1200x32). At least, FillBuffer() seems to only be able to process around 6-15 FPS, which obviously isn't fast enough.
I've tried increasing the cBuffers count in DecideBufferSize(), but that only pushes the problem to a later point, of course.
What are my options to speed up the process? What's the right way to do live high res encoding via DirectShow? I eventually want to end up with a WMV video, but maybe that has to be a post-processing step.
You already have good answers posted to your question here: High resolution capture and encoding too slow. The task is too complex for the CPU in your system, which is simply not fast enough to perform realtime video encoding in the configuration you set it up for.
