After hours of searching the net I'm quite desperate to find a solution for this. I have an up-and-running OGG Theora decoder in DirectShow which outputs the YV12 and YUY2 color formats.
Now I want to make an RGB pixel-manipulation filter for this output and feed the result into the video renderer.
According to this and this, it should be really easy and transparent, but it isn't.
For example, I implemented this check in CheckInputType():
// Accept packed RGB565 video only; anything else is rejected so that the
// graph builder has to put a converter in front of this filter.
if (IsEqualGUID(*mtIn->Type(), MEDIATYPE_Video) &&
    IsEqualGUID(*mtIn->Subtype(), MEDIASUBTYPE_RGB565))
{
    return S_OK;
}
return VFW_E_TYPE_NOT_ACCEPTED;
I would expect DirectShow to insert that MSYUV filter between the Theora decoder and my filter and do the job for me (i.e. convert the video to RGB). The problem is that I get an error every time (in the GraphEdit application), and I'm 100% sure the input is YV12 (checked in the debugger). The only explanation I can think of is that mention of the AVI Decompressor, but there's no further info about it.
Does that mean I have to use the AVI container if I want to get this automatic functionality?
The strange thing is that it works, for example, for WMV videos (with YUV on their output); only this OGG decoder has a problem with it. So the question is probably: what is this OGG decoder missing?
Too bad the MSYUV filter doesn't work like the Color Space Converter, i.e. isn't visible and directly usable in GraphEdit...
I'd appreciate any hint on this; programming my own YV12 -> RGB converter is my last resort.
There is no YUV to RGB colorspace converter built into DirectShow. The reason WMV files work for you is that the WMV decoder filter will spit out RGB or YUV data depending on the type of filter you connect it to.
The best you can do here is write a colorspace converter filter yourself, or just convert the YUV data after you get it.
Fourcc.org has a nice article on converting from YUV to RGB. The book Video Demystified by Keith Jack also has all the details on colorspace conversions.
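For reference, here is a minimal, unoptimized sketch of the per-pixel math from that kind of article, assuming a YV12 frame (planar Y, then quarter-size V, then U, even dimensions) converted to packed 24-bit BGR; it ignores stride, bottom-up orientation and all the DirectShow plumbing:

#include <algorithm>
#include <cstdint>

static inline uint8_t Clamp(int v) { return (uint8_t)std::min(255, std::max(0, v)); }

// src: YV12 frame (full-size Y plane, then quarter-size V plane, then U plane)
// dst: packed BGR, width * height * 3 bytes, top-down
void YV12ToRGB24(const uint8_t* src, uint8_t* dst, int width, int height)
{
    const uint8_t* yPlane = src;
    const uint8_t* vPlane = yPlane + width * height;
    const uint8_t* uPlane = vPlane + (width / 2) * (height / 2);

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            int c = yPlane[y * width + x] - 16;
            int d = uPlane[(y / 2) * (width / 2) + (x / 2)] - 128;
            int e = vPlane[(y / 2) * (width / 2) + (x / 2)] - 128;

            uint8_t* px = dst + (y * width + x) * 3;
            px[0] = Clamp((298 * c + 516 * d + 128) >> 8);            // B
            px[1] = Clamp((298 * c - 100 * d - 208 * e + 128) >> 8);  // G
            px[2] = Clamp((298 * c + 409 * e + 128) >> 8);            // R
        }
    }
}

The integer coefficients are the common BT.601 approximation; a real filter would also handle the stride from VIDEOINFOHEADER and the bottom-up layout of RGB DIBs.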
I'm trying to understand how I can generate a waveform from an audio (or video) file to display to the user.
I've been googling around for quite a while now and can't determine if this is even possible in Qt without using something like FFmpeg. I've seen all of these classes: QMediaPlayer, QMediaContent, QMediaResource, QAudioProbe and experimented with the Qt Media Player Example but am just not seeing where I can access the actual audio buffer.
So I have 2 questions:
Is what I want to do even possible without 3rd party libraries?
If it is possible, can some kind soul outline what I need to read and understand in order to access the audio data?
I have tried the suggestions from this question (Audio visualization with QMediaPlayer) but the result of audioProbe->setSource(player) is always false and the method processBuffer never gets called.
audioProbe = new QAudioProbe(this);
// setSource() returns false when the media object's backend does not
// support monitoring, in which case audioBufferProbed() never fires
bool success = audioProbe->setSource(player);
qDebug() << success;
connect(audioProbe, SIGNAL(audioBufferProbed(QAudioBuffer)),
        this, SLOT(processBuffer(QAudioBuffer)));
Update: Adding some additional detail in the hope of clarifying things.
For testing/learning I am using the Media Player Example which ships with Qt, so it is set up correctly with Q_OBJECT etc.
For audio, I tested with both .mp3 and .wav files. FWIW, the player example won't play video for some reason (.mp4 and .avi were tested).
The player in the code is QMediaPlayer – which inherits from QMediaObject. The example code for the Player class is here. I added my code (in original comment above) right after the player is instantiated. I also tried adding it once media is loaded.
I tried declaring my slot first as private, then as public – either way, it is never called.
Frustrating that such a simple thing is so hard.
Going the "no external library" route will likely just lead to more of a headache and more work than is necessary. The other advantage of going with an established library is you won't be bound to one file format, as not all formats store their data the same way. If the audio format is uncompressed (wav or other) you can read the header until you get to the data chunk. An answer to this question here details this in C. You should be able to get an idea for the file format from this to apply it to another language.
You will want to know how many channels are in the wav file, the bit depth, and also the sampling rate before you can do anything worthwhile with the data. All of this info can be grabbed from the header.
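As an illustration of the approach (not the exact code from the linked answer), here is a rough C++ sketch that walks a RIFF/WAVE file's chunks and pulls out the channel count, sample rate and bit depth, assuming a little-endian machine and a canonical chunk layout; the samples themselves then follow in the "data" chunk:

#include <cstdint>
#include <cstdio>
#include <cstring>

struct WavInfo { uint16_t channels; uint32_t sampleRate; uint16_t bitsPerSample; };

bool ReadWavInfo(const char* path, WavInfo* out)
{
    FILE* f = fopen(path, "rb");
    if (!f) return false;

    // RIFF header: "RIFF" <size> "WAVE"
    char riff[4], wave[4];
    uint32_t riffSize = 0;
    if (fread(riff, 1, 4, f) != 4 || fread(&riffSize, 4, 1, f) != 1 ||
        fread(wave, 1, 4, f) != 4 || memcmp(riff, "RIFF", 4) || memcmp(wave, "WAVE", 4))
    {
        fclose(f);
        return false;
    }

    // Walk the chunk list until the "fmt " chunk shows up
    char id[4];
    uint32_t size = 0;
    while (fread(id, 1, 4, f) == 4 && fread(&size, 4, 1, f) == 1)
    {
        if (memcmp(id, "fmt ", 4) == 0)
        {
            uint16_t format = 0;
            fread(&format, 2, 1, f);            // 1 = PCM
            fread(&out->channels, 2, 1, f);
            fread(&out->sampleRate, 4, 1, f);
            fseek(f, 6, SEEK_CUR);              // skip byte rate + block align
            fread(&out->bitsPerSample, 2, 1, f);
            fclose(f);
            return true;
        }
        fseek(f, size + (size & 1), SEEK_CUR);  // skip other chunks (word-aligned)
    }
    fclose(f);
    return false;
}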
It turns out that QAudioProbe is not supported on OSX – the platform I am working on. It took quite a while (a "Qt while...") to ferret that info out, so I am posting it here explicitly.
See this document for full details: Qt 5.5.0 Multimedia Backends
I'm trying to split a *.mov file into raw audio and raw video. I have a DirectShow filter which works as a decoder for the video stream, and Windows Media Player can actually see and use it to play this video file, but I'm having a hard time figuring out how exactly that works, since I need to compose a complex DirectShow graph. I assumed that WMP would use the WM ASF Reader, but if I try to add this filter to the graph in GraphEdit with the *.mov file as parameter, it fails with error code 0xC00D0026, which makes sense since it's supposed to work with uncompressed formats only.
Which other DirectShow source filters can be used by WMP in order to split a *.mov video file into raw video and audio?
Windows Media Player (current versions, not ancient) does not use DirectShow for MOV files. Instead, it uses Media Foundation.
FYI: 0xC00D0026 is NS_E_UNRECOGNIZED_STREAM_TYPE "The specified protocol is not recognized. Be sure that the file name and syntax, such as slashes, are correct for the protocol."
I suppose you can find suitable DirectShow components to demultiplex MOV files: Haali Media Splitter and the GDCL MPEG-4 Demultiplexer are among the widely used ones.
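For building such a graph from your own code (WMP itself goes through Media Foundation, as noted above), a minimal sketch that lets DirectShow pick whichever registered splitter handles the file and then connect decoders and renderers by merit, the same way GraphEdit's "Render Media File" does; COM initialization and error handling are mostly left out:

#include <dshow.h>
#pragma comment(lib, "strmiids.lib")

HRESULT BuildMovGraph(const wchar_t* path, IGraphBuilder** outGraph)
{
    IGraphBuilder* graph = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_FilterGraph, nullptr, CLSCTX_INPROC_SERVER,
                                  IID_IGraphBuilder, (void**)&graph);
    if (FAILED(hr))
        return hr;

    // RenderFile() loads the registered source filter / splitter for the file
    // and builds the rest of the chain automatically.
    hr = graph->RenderFile(path, nullptr);

    *outGraph = graph;   // caller releases
    return hr;
}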
I'm working on an application based on DirectShow that has to convert an AVI source file to an MP4 file that can be played back with QuickTime.
Since 3ivx, according to my web research the most popular way to fulfill this task, has become commercial (and my budget is quite limited), I decided to use a solution based on ffdshow.
I created a simple graph in GraphEdit, using LAME for audio encoding and the GDCL MPEG-4 Multiplexor for the muxing, but every time I try to play the movie with QuickTime, I get an error indicating a wrong "sample description".
Playback with Windows Media Player is working, except that there is no sound.
My guess is that there's a problem with the muxer, because every time I try to add audio encoding, GraphEdit automatically adds a decoder after the encoder (see picture link).
http://imageshack.us/photo/my-images/39/graphjrgr.png/
Any ideas on how to integrate ffdshow in a better way, tips for alternative MP4 muxers, or a completely different approach are appreciated!
The GDCL muxer supports only a limited number of audio formats; you should probably check the muxer's source code to see if the formats you are using are in fact supported. Basically, you need to choose an audio encoder whose output the muxer recognizes as valid. It might also be possible to use GraphEdit to set different properties on the encoder filter that allow things to work better.
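One way to verify this at run time, instead of digging through the muxer source, is to offer the encoder's output media types to the muxer's audio input pin and see whether it accepts any of them. A hedged sketch, assuming you already hold the two IPin pointers and build against the DirectShow base classes:

#include <streams.h>   // DirectShow base classes (for DeleteMediaType)

// Returns S_OK if the muxer's audio input pin accepts at least one of the
// media types proposed by the encoder's output pin.
HRESULT CheckMuxAcceptsAudio(IPin* encoderOutPin, IPin* muxAudioInPin)
{
    IEnumMediaTypes* types = nullptr;
    HRESULT hr = encoderOutPin->EnumMediaTypes(&types);
    if (FAILED(hr))
        return hr;

    hr = VFW_E_TYPE_NOT_ACCEPTED;
    AM_MEDIA_TYPE* mt = nullptr;
    while (types->Next(1, &mt, nullptr) == S_OK)
    {
        if (muxAudioInPin->QueryAccept(mt) == S_OK)
            hr = S_OK;
        DeleteMediaType(mt);
    }
    types->Release();
    return hr;
}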
I have had some luck with the Monogram x264 (video) and AAC (audio) encoders. See http://blog.monogram.sk/janos/directshow-filters/
Finally, try the debug version of the GDCL mp4 muxer.
Also, you must be aware of the MPEG LA licensing requirements for x264: http://www.mpegla.com/main/programs/AVC/Pages/FAQ.aspx
I would like to encode video in my app with VP8. I use the RGB24 format in my app, but the VP8 DirectShow filter accepts only YUV formats (http://www.webmproject.org/tools/#directshow_filters).
I've googled for an "RGB to YUV DirectShow filter" but with no success. I don't want to write this filter myself from scratch, so I would appreciate information on where to find such a filter.
Thanks!
You could try Geraint Davies' YUV transform filter to see if it supports the conversion.
Starting with Vista you can use the Color Converter DSP; does that help?
If you know how to implement a transform filter, I have a fast YUV to RGB algorithm somewhere. I used DirectShow a looong time ago, so I can't be of any more help than this :P
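If you do end up writing the transform yourself, the actual per-pixel math is small. A rough sketch of packed RGB24 (BGR byte order) to planar I420 with the usual BT.601 integer coefficients; chroma is taken from the top-left pixel of each 2x2 block, even width and height assumed, stride and bottom-up DIB orientation ignored:

#include <cstdint>

void RGB24ToI420(const uint8_t* rgb, uint8_t* dst, int width, int height)
{
    uint8_t* yPlane = dst;
    uint8_t* uPlane = yPlane + width * height;
    uint8_t* vPlane = uPlane + (width / 2) * (height / 2);

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            const uint8_t* px = rgb + (y * width + x) * 3;  // B, G, R
            int b = px[0], g = px[1], r = px[2];

            yPlane[y * width + x] =
                (uint8_t)((( 66 * r + 129 * g +  25 * b + 128) >> 8) + 16);

            if ((x & 1) == 0 && (y & 1) == 0)   // one chroma sample per 2x2 block
            {
                int i = (y / 2) * (width / 2) + (x / 2);
                uPlane[i] = (uint8_t)(((-38 * r -  74 * g + 112 * b + 128) >> 8) + 128);
                vPlane[i] = (uint8_t)((( 112 * r - 94 * g -  18 * b + 128) >> 8) + 128);
            }
        }
    }
}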
I am working on an image processing application and have to display an image sequence. I would like to avoid any extra overhead for (internal) format conversions.
I believe RGB should be the optimal format for display, but SDL accepts various YUV formats and there is no native (to SDL) support for RGB, whereas Qt does not accept YUV formats at all. X accepts the RGBX format natively. The images can be generated in any desired format for display, but CPU/GPU cycles for format conversion should be avoided. Any suggestion on the right way to display image sequences would be great.
The output format is ARGB. SDL works with RGB surfaces, so I don't understand your claim that "there is no native (to SDL) support for RGB".
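For instance, a sketch assuming SDL 1.2 and a 32-bit screen surface already created with SDL_SetVideoMode(w, h, 32, SDL_SWSURFACE): an existing ARGB buffer can be wrapped in a surface and blitted without any YUV path involved (the alpha byte is ignored for display):

#include <SDL.h>

// Wrap one ARGB frame in an SDL surface and blit it to the screen surface.
void ShowFrame(SDL_Surface* screen, void* argbPixels, int width, int height)
{
    SDL_Surface* frame = SDL_CreateRGBSurfaceFrom(
        argbPixels, width, height, 32, width * 4,
        0x00FF0000, 0x0000FF00, 0x000000FF, 0);  // R, G, B masks; alpha ignored

    SDL_BlitSurface(frame, NULL, screen, NULL);
    SDL_Flip(screen);
    SDL_FreeSurface(frame);
}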
The native video acceleration interface of X only supports YUV input, however. The YUV->RGB conversion on the GPU comes for free if you use the video acceleration interface, so no "cycles" are wasted there.
Perhaps you should go into more detail about your purposes. What is the framerate we are dealing with here?
I think you should use any uncompressed image + QPixmap.