I'm trying to understand how I can generate a waveform from an audio (or video) file to display to the user.
I've been googling around for quite a while now and can't determine if this is even possible in Qt without using something like FFmpeg. I've seen all of these classes: QMediaPlayer, QMediaContent, QMediaResource, QAudioProbe and experimented with the Qt Media Player Example but am just not seeing where I can access the actual audio buffer.
So I have two questions:
1. Is what I want to do even possible without third-party libraries?
2. If it is possible, can some kind soul outline what I need to read and understand in order to access the audio data?
I have tried the suggestions from this question (Audio visualization with QMediaPlayer) but the result of audioProbe->setSource(player) is always false and the method processBuffer never gets called.
audioProbe = new QAudioProbe(this);

// setSource() returns false when the backend cannot attach a probe
// to this media object; this is the value that is always false here.
bool success = audioProbe->setSource(player);
qDebug() << success;

// Should fire for every decoded buffer, but processBuffer is never called.
connect(audioProbe, SIGNAL(audioBufferProbed(QAudioBuffer)),
        this, SLOT(processBuffer(QAudioBuffer)));
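For reference, here is the kind of slot I'm expecting to be called (a minimal sketch: it assumes 16-bit signed PCM, so a real version would have to check buffer.format() first):

// Pull one peak value per probed buffer as raw material for a waveform.
void Player::processBuffer(const QAudioBuffer &buffer)
{
    const qint16 *samples = buffer.constData<qint16>();   // assumes S16 PCM
    qint16 peak = 0;
    for (int i = 0; i < buffer.sampleCount(); ++i)
        peak = qMax(peak, qint16(qAbs(samples[i])));
    qDebug() << "buffer peak:" << peak;
}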
Update: Adding some additional detail in the hope of clarifying things.
For testing/learning I am using the Media Player Example which ships with Qt, so it is set up correctly with Q_OBJECT etc.
For audio, I tested with both .mp3 and .wav files. FWIW, the player example won't play video for some reason (.mp4, .avi were tested)
The player in the code is QMediaPlayer – which inherits from QMediaObject. The example code for the Player class is here. I added my code (in original comment above) right after the player is instantiated. I also tried adding it once media is loaded.
I tried declaring my slot first as private, then as public – either way, it is never called.
Frustrating that such a simple thing is so hard.
Going the "no external library" route will likely just lead to more of a headache and more work than is necessary. The other advantage of going with an established library is you won't be bound to one file format, as not all formats store their data the same way. If the audio format is uncompressed (wav or other) you can read the header until you get to the data chunk. An answer to this question here details this in C. You should be able to get an idea for the file format from this to apply it to another language.
You will want to know how many channels are in the WAV file, the bit depth, and the sampling rate before you can do anything worthwhile with the data. All of this info can be grabbed from the header.
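As a rough illustration, here is a minimal C++ sketch of pulling those fields out of a PCM WAV file. It only handles the canonical chunk layout on a little-endian host; real files can carry extra chunks, so treat it as a starting point rather than a complete parser.

#include <cstdint>
#include <cstring>
#include <fstream>
#include <vector>

struct WavInfo {
    uint16_t channels = 0;
    uint32_t sampleRate = 0;
    uint16_t bitsPerSample = 0;
    std::vector<char> samples;   // raw PCM from the "data" chunk
};

bool readWav(const char *path, WavInfo &out)
{
    std::ifstream f(path, std::ios::binary);
    char riff[4], wave[4];
    uint32_t riffSize = 0;
    f.read(riff, 4);
    f.read(reinterpret_cast<char *>(&riffSize), 4);
    f.read(wave, 4);
    if (!f || std::memcmp(riff, "RIFF", 4) || std::memcmp(wave, "WAVE", 4))
        return false;

    // Walk the chunks until the "data" chunk has been read.
    char id[4];
    uint32_t size = 0;
    while (f.read(id, 4) && f.read(reinterpret_cast<char *>(&size), 4)) {
        if (!std::memcmp(id, "fmt ", 4)) {
            uint16_t audioFormat = 0;                 // 1 = PCM
            f.read(reinterpret_cast<char *>(&audioFormat), 2);
            f.read(reinterpret_cast<char *>(&out.channels), 2);
            f.read(reinterpret_cast<char *>(&out.sampleRate), 4);
            f.ignore(6);                              // byte rate + block align
            f.read(reinterpret_cast<char *>(&out.bitsPerSample), 2);
            f.ignore(size - 16);                      // any fmt extension
        } else if (!std::memcmp(id, "data", 4)) {
            out.samples.resize(size);
            f.read(out.samples.data(), size);
            return f.good();
        } else {
            f.ignore(size + (size & 1));              // chunks are word-aligned
        }
    }
    return false;
}

From there, a waveform display is essentially one min/max (or peak) value computed per block of samples, drawn per pixel column.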
It turns out that QAudioProbe is not supported on OS X, the platform I am working on. It took quite a while (a "Qt while...") to ferret that info out, so I am posting it here explicitly.
See this document for full details: Qt 5.5.0 Multimedia Backends
Related
So I have ffmpeg writing its progress to a text file, and I need to read the new values (lines) from that file as they appear. How should I approach this using Qt classes in order to minimize the amount of code I have to write?
I don't even have an idea where to start, other than doing ugly things like seeking to the end, storing that position, then seeking to the end again a bit later and comparing the new position to the previous one. It's unclear to me whether QTextStream can be used here, for instance.
I used the Win32 API's own file-system notification interface some time ago and it worked 100% reliably; modern OSes provide notifications for file changes, and Qt wraps that functionality as well. Specifically, for tracking file changes I would use the QFileSystemWatcher::fileChanged signal to trigger a slot, say myFileReadNextBuffer(), only when the file has actually changed. You would then still want to work out how many bytes were added by subtracting the previous file length from the new one. There is also a related question here: How to know when and which files are changed in windows filesystem with winapi.
If the file is only growing:
Whether the file is text-based or not, I would open it in shared mode, read to the end, and then read up to the new end each time a change notification arrives.
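A minimal Qt sketch of that idea (the class name is made up; note that QFileSystemWatcher may need the path re-added on platforms where the file gets replaced rather than appended to):

#include <QFile>
#include <QFileSystemWatcher>
#include <QDebug>

class ProgressTail : public QObject
{
    Q_OBJECT
public:
    explicit ProgressTail(const QString &path, QObject *parent = nullptr)
        : QObject(parent), m_file(path)
    {
        m_file.open(QIODevice::ReadOnly | QIODevice::Text);
        m_watcher.addPath(path);
        connect(&m_watcher, &QFileSystemWatcher::fileChanged,
                this, &ProgressTail::readAppended);
    }

private slots:
    void readAppended()
    {
        // QFile keeps its read position, so each call returns only
        // the bytes appended since the previous read.
        const QByteArray chunk = m_file.readAll();
        for (const QByteArray &line : chunk.split('\n')) {
            if (!line.trimmed().isEmpty())
                qDebug() << "new line:" << line.trimmed();
        }
    }

private:
    QFile m_file;
    QFileSystemWatcher m_watcher;
};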
I've been doing some work in VB.NET with DirectShow over the past 3-4 weeks. I'm creating an application to keep tags on a video, and eventually I want to be able to extract the tagged parts of the video to a new file. In a video that is 2 hours long I might want to extract, say, 50 10-15 second "clips", up to 15 times (event tagging). This will be a free application.
I've found it brilliant (and easy) to render/seek/play clips etc. on XP-Win7 with no issues. I've "discovered" the joys of GraphEdit, creating graphs, the issues with COM in VB.NET, GMFBridge, etc.
Now I need some advice: am I using the right technology? DirectShow seems very resistant to the idea of "open video", "seek to clip", "write clip to file", ... repeat for all clips, close file. I can sort of do this already if I visibly render the video, but I would need to do it as a background task, faster than real-time render speed.
Things that seem to be missing are:
- an example of anyone doing anything similar (export multiple clips to a single file)
- no easily available 64-bit compressors (lots of 32-bit stuff around)
- all the references and examples I do find are VERY old
- VB.NET is not the first "port of call" for DirectShow developers
So, the question is, should I be using something else?
If not, has anyone done anything similar before? I'm not looking for their code; I just want some guidelines, as it takes ages to figure things out in DirectShow and VB.NET using just trial & error (and Google).
I've looked at AForge (no sound), FFmpeg (a command-line toolset), Media Foundation (I'm reluctant to throw away XP) and a variety of commercial helper libraries, but I'm not really getting any further.
Apologies for the length but I wanted readers to understand the background.
All help appreciated.
To output clips to a single file, Microsoft created DirectShow Editing Services. Sometimes it works, sometimes it doesn't. We use it in our software to create videos from clips, just as you describe. With a little extra work you can also add effects to the video.
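To give a flavour of the API, here is a rough C++ sketch of placing one clip on a DES timeline (interfaces are from qedit.h; error handling and cleanup are omitted, and the file name and clip times are placeholders):

#include <dshow.h>
#include <qedit.h>

void buildTimeline()
{
    CoInitialize(nullptr);

    IAMTimeline *timeline = nullptr;
    CoCreateInstance(CLSID_AMTimeline, nullptr, CLSCTX_INPROC_SERVER,
                     IID_IAMTimeline, reinterpret_cast<void **>(&timeline));

    // One group holding one track. (A real program must also set the
    // group's media type via IAMTimelineGroup::SetMediaType.)
    IAMTimelineObj *groupObj = nullptr;
    timeline->CreateEmptyNode(&groupObj, TIMELINE_MAJOR_TYPE_GROUP);
    timeline->AddGroup(groupObj);

    IAMTimelineComp *comp = nullptr;
    groupObj->QueryInterface(IID_IAMTimelineComp, reinterpret_cast<void **>(&comp));

    IAMTimelineObj *trackObj = nullptr;
    timeline->CreateEmptyNode(&trackObj, TIMELINE_MAJOR_TYPE_TRACK);
    comp->VTrackInsBefore(trackObj, -1);

    IAMTimelineTrack *track = nullptr;
    trackObj->QueryInterface(IID_IAMTimelineTrack, reinterpret_cast<void **>(&track));

    // Each tagged clip becomes a source on the track. Times are in
    // 100 ns units; this one puts seconds 60-70 of the source at the
    // start of the output.
    const REFERENCE_TIME SEC = 10000000;
    IAMTimelineObj *srcObj = nullptr;
    timeline->CreateEmptyNode(&srcObj, TIMELINE_MAJOR_TYPE_SOURCE);
    srcObj->SetStartStop(0, 10 * SEC);

    IAMTimelineSrc *src = nullptr;
    srcObj->QueryInterface(IID_IAMTimelineSrc, reinterpret_cast<void **>(&src));
    BSTR name = SysAllocString(L"source.avi");   // placeholder path
    src->SetMediaName(name);
    SysFreeString(name);
    src->SetMediaTimes(60 * SEC, 70 * SEC);
    track->SrcAdd(srcObj);

    // The render engine turns the timeline into a filter graph; its group
    // output pin is then connected to a compressor/mux/file writer, and a
    // file-writing graph runs as fast as the machine allows, not real time.
    IRenderEngine *engine = nullptr;
    CoCreateInstance(CLSID_RenderEngine, nullptr, CLSCTX_INPROC_SERVER,
                     IID_IRenderEngine, reinterpret_cast<void **>(&engine));
    engine->SetTimelineObject(timeline);
    engine->ConnectFrontEnd();
    // engine->GetGroupOutputPin(0, &pin); ... connect pin, run graph ...
}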
It is also possible to use AviSynth. It's a scripting system and frameserver for DirectShow.
As far as I know, you can also create a video from multiple clips with Media Foundation, but I have never tried it.
I'm working on a DirectShow-based application that has to convert an AVI source file to an MP4 file that can be played back with QuickTime.
Since 3ivx, according to my web research the most popular way to fulfill this task, has become commercial (and my budget is quite limited), I decided to use a solution based on ffdshow.
I created a simple graph in GraphEdit, using LAME for audio encoding and the GDCL MPEG-4 Multiplexor for muxing, but every time I try to play the movie with QuickTime I get an error indicating a wrong "sample description".
Playback with Windows Media Player is working, except that there is no sound.
My guess is that there's a problem with the muxer, because every time I try to add audio encoding, GraphEdit automatically adds a decoder after the encoding unit (see picture link).
http://imageshack.us/photo/my-images/39/graphjrgr.png/
Any ideas on how to integrate ffdshow in a better way, tips for alternative MP4 muxers, or a completely different approach are appreciated!
The GDCL muxer supports a limited number of audio formats; you should probably check the muxer's source code to see whether the formats you are using are in fact supported. Basically, you need to choose an audio encoder whose output the mux recognizes as valid. It may also be possible to use GraphEdit to set different properties on the encoder filter so that things connect better.
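One way to check is to enumerate the media types the encoder's output pin offers and compare them with what the muxer accepts. A minimal C++ sketch, assuming you already hold an IPin* for the encoder's output pin:

#include <dshow.h>
#include <cstdio>

void dumpPinMediaTypes(IPin *pin)
{
    IEnumMediaTypes *mtEnum = nullptr;
    if (FAILED(pin->EnumMediaTypes(&mtEnum)))
        return;

    AM_MEDIA_TYPE *mt = nullptr;
    while (mtEnum->Next(1, &mt, nullptr) == S_OK) {
        OLECHAR major[40], sub[40];
        StringFromGUID2(mt->majortype, major, 40);
        StringFromGUID2(mt->subtype, sub, 40);
        wprintf(L"major %ls  sub %ls\n", major, sub);

        // Manual cleanup; FreeMediaType/DeleteMediaType come from the
        // DirectShow base classes, which may not be linked in.
        if (mt->cbFormat)
            CoTaskMemFree(mt->pbFormat);
        if (mt->pUnk)
            mt->pUnk->Release();
        CoTaskMemFree(mt);
    }
    mtEnum->Release();
}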
I have had some luck with the Monogram x264(video) and AAC(audio) encoders. See http://blog.monogram.sk/janos/directshow-filters/
Finally, try the debug version of the GDCL mp4 muxer.
Also, be aware of the MPEG LA licensing requirements for H.264/x264: http://www.mpegla.com/main/programs/AVC/Pages/FAQ.aspx
Having had a quick look at the Flex docs, I can't seem to find any reference to playing audio content from a custom (possibly encrypted; don't worry, it's not that evil) container format. Is this possible, and if so, could someone point me in the right direction?
Or, if that's not possible, is there some way to hook into the disk/network I/O (disk is much more important in this case) of the sound-playing mechanism, so that I can provide a supported container in memory from a custom wrapper?
Since Flash Player 10, it's possible to write PCM / raw audio data to a Sound object.
Basically, you call play() on an "empty" Sound object and it starts periodically dispatching a SampleDataEvent, requesting data. You then write to the audio stream through the data ByteArray exposed by the event object.
http://help.adobe.com/en_US/FlashPlatform//reference/actionscript/3/flash/events/SampleDataEvent.html?filter_flex=4
http://www.adobe.com/devnet/flash/articles/dynamic_sound_generation/index.html
Also, if you're interested in good articles and references for audio programming in ActionScript, you might want to check out Andre Michelle's stuff:
http://blog.andre-michelle.com/
http://lab.andre-michelle.com/
A flash.media.Sound must either be:
- constructed/loaded with a URLRequest, or
- inherit its data through embedding.
There is currently no provision for directly piping MP3 (or AAC, or video) data to any "media" object such as Sound. You can only get the Sound object to download the data for itself. There are people who are upset about this, including myself; you are not alone!
I say "currently" because it's not unthinkable that Adobe will update the API to make this possible in a future version. For now, you're best off going with the decoding-to-a-dynamic-sound workaround mentioned by Juan, if you really need to be able to do this.
And post a feature request at Adobe's bug tracker, or vote on an existing one!
I have a legacy file format that contains sounds embedded in it (in various encodings). I would like to be able to play these sounds in Flash (Air?) by reading the sound bytes out of the file and instantiating a Sound object with them.
If the sound is unencoded (e.g., raw PCM), I've found that I can use the new Flex 4 SampleDataEvent.SAMPLE_DATA event to play it.
However, if the sound is encoded (e.g., MP3), then I'm at a loss. The audio fed to SampleDataEvent.SAMPLE_DATA has to be raw PCM, and from what I've seen, encoded Sounds can only be instantiated by [Embed]ing them or by using a URLRequest with Sound.load().
Surely there's a third way? AMF or e4x?
There are really only two routes for you to go. The first is to write a decoder in ActionScript. You may be able to use Alchemy to port over some C/C++ code to make this job significantly easier (and possibly more performant). This is exactly how I got Ogg Vorbis playback to work with Flash.
The other option is to dynamically create a valid SWF inside a ByteArray. That SWF could contain an embedded sound object made up of your sound data. A number of folks pulled off similar hacks before Flash Player 10 was available. I believe you can find a good starting point in Andre Michelle and Joa Ebert's PopForge codebase.