I am using DirectShow in my application to capture video from webcams. I have problems previewing and capturing 1080P video with some cameras, for example the Logitech HD Pro Webcam C910.
The 1080P video preview was very jerky and showed no HD clarity. The enumerated device name was "USB Video Device".
Today we installed the Logitech webcam software on these XP machines. In that application, we could see 1080P video without any jerking. We also recorded 1080P video in the Logitech application, and it played back in high quality.
But when I test my application, I see the following:
The enumerated device name has changed to "Logitech Pro Webcam C910" instead of "USB Video Device" as before.
My application uses about 20% CPU, but the "System" process uses 60% or more, and overall CPU usage hovers around 100%.
Even though the video quality has greatly improved, the jerks are still there, maybe because of the 100% CPU usage.
When I close my application, the high CPU utilization by the "System" process goes away.
Regarding my application - It uses ICaptureGraphBuilder2::RenderStream to create Preview and Capture streams.
In Capture Stream, I connect Camera filter to NULL renderer with sample grabber as the intermediate filter.
In preview stream, I have
g_pBuild->RenderStream(&PIN_CATEGORY_PREVIEW,&MEDIATYPE_Video,cam,NULL,NULL);
The preview is displayed in a window specified through the IVideoWindow interface. I use the following:
g_vidWin->put_Owner((OAHWND)(HWND)hWnd);
g_vidWin->put_WindowStyle(WS_CHILD | WS_CLIPSIBLINGS);
g_vidWin->put_MessageDrain((OAHWND)hWnd);
I tried setting the frame rate to different values (AvgTimePerFrame = 500000 for 20 fps, 666667 for 15 fps, etc.).
But all of these trials give the same result. The clarity has improved, but some jerks remain and CPU usage is almost 100% because of the 60%+ utilization by "System". When I close my video application, usage by "System" goes back to 1-2%.
Any help on this is most welcome.
Thanks in advance,
Use IAMStreamConfig.SetFormat() to select the frame rate, dimensions, color space, and compression of the output streams (Capture and Preview) from a capture device.
Aside: The comment above of "It doesn't change the source filter's own rate" is completely wrong. The whole purpose of this interface is to define the output format and framerate of the captured video.
Use IAMStreamConfig.GetStreamCaps() to determine which frame rates, dimensions, color spaces, and compression formats are available. Most cameras provide a number of different formats.
It sounds like the fundamental issue you're facing is that USB bandwidth (at least prior to USB3) can't sustain 30fps 1080P without compression. I'm most familiar with the Microsoft LifeCam Studio family of USB cameras, and these devices perform hardware compression to send the video over the wire, and then eat up a substantial fraction of your CPU on the receiving end converting the compressed video from Motion JPEG into a YUV format. Presumably the Logitech cameras work in a similar fashion.
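A rough back-of-the-envelope check of that bandwidth claim can be sketched as follows. It assumes an uncompressed 16-bits-per-pixel format such as YUY2 (an assumption; the camera may offer other formats) and compares against the 480 Mbit/s signaling rate of USB 2.0 high speed, which is an upper bound that real transfers do not reach. The function names are hypothetical.

```cpp
#include <cstdint>

// Raw data rate, in bits per second, of an uncompressed video stream.
constexpr std::uint64_t rawBitsPerSecond(std::uint64_t width, std::uint64_t height,
                                         std::uint64_t bitsPerPixel, std::uint64_t fps) {
    return width * height * bitsPerPixel * fps;
}

// USB 2.0 high-speed signaling rate; usable throughput is lower in practice.
constexpr std::uint64_t kUsb2SignalingBitsPerSecond = 480000000ULL;

// True if the raw stream would fit even under the optimistic signaling rate.
constexpr bool fitsInUsb2(std::uint64_t width, std::uint64_t height,
                          std::uint64_t bitsPerPixel, std::uint64_t fps) {
    return rawBitsPerSecond(width, height, bitsPerPixel, fps) <= kUsb2SignalingBitsPerSecond;
}
```

For 1920x1080 at 16 bits per pixel and 30 fps this gives 995,328,000 bits per second, roughly twice the USB 2.0 signaling rate, so the camera has to compress before sending.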
The framerate that cameras produce is influenced by the additional workload of performing auto-focus, auto-color correction, and auto-exposure in software. Try disabling all these features on your camera if possible. In the era of Skype, camera software and hardware has become less attentive to maintaining a high framerate in favor of better image quality.
The DirectShow timing model for capture continues to work even if the camera can't produce frames at the requested rate, as long as the camera indicates that frames are missing. It does this using a "dropped frame" count field that rides along with each captured frame. The dropped frames plus the "real" frames must add up to the frame rate requested via IAMStreamConfig.SetFormat().
Using the LifeCam Studio on an I7 I have captured at 30fps 720p with preview, compressed to H.264 and written an .mp4 file to disk using around 30% of the CPU, but only if all the auto-focus/color/exposure settings on the camera are disabled.
I have an audio playing app that runs on several distributed devices, each with their own clock. I am using QAudioOutput to play the same audio on each device, and UDP broadcast from a master device to synchronize the other devices, so far so good.
However, I am having a hard time getting an accurate picture of "what is playing now" from QAudioOutput. I am using the QAudioOutput bufferSize() and bytesFree() to estimate what audio frame is currently being fed to the sound system, but the bytesFree() value progresses in a "chunky" fashion, so that (bufferSize() - bytesFree()) / bytesPerFrame doesn't give the number of frames remaining in the buffer, but some smaller number that bounces around relative to it.
The result I am getting now is that when my "drift indicator" updates, it will run around 0 for several seconds, then get indications in the -15 to -35 ms range every few seconds for maybe 20 seconds, then a correcting jump of about +120ms. Although I could try to analyze this long term pattern and tease out the true drift rate (maybe a couple of milliseconds per minute), I'd much rather work with more direct information if it's available.
Is there any way to read the true number of frames remaining in the QAudioOutput buffer while it is playing a stream?
I realize I could minimize my problems by radically reducing the buffer size and feeding QAudioOutput with a high priority process, but I'd rather have a solution that uses longer buffers and isn't so fussy about what it runs on - target platforms vary from Windows 10 machines to Raspberry Pi Zero Ws, possibly to Android phones.
I was recently playing a video on YouTube, and a thought came to my mind. As videos are played, a user can skip ahead in the video and the video just resumes at that point without any trouble.
What I can't seem to find is how this works. I know that when I request a file through HTTP it downloads the entire thing, so starting a binary stream halfway through the video doesn't seem possible using HTTP. Is there any RFC or related document on how browsers do this?
Thank you
There are a couple of different technologies, but they all essentially allow you to specify an offset in the video and then download a 'chunk' from there.
The simple way to do this is with byte ranges and HTTP progressive download; there is an RFC which covers this:
https://www.rfc-editor.org/rfc/rfc7233
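To make the byte-range idea concrete, here is a minimal sketch of the request header a client would send to start partway into a file. The helper names rangeFrom and rangeBetween are hypothetical; only the "Range: bytes=..." syntax comes from the RFC. A server that supports ranges answers with status 206 (Partial Content) and a Content-Range header describing what it returned.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: header asking for everything from byte `offset` to the end.
std::string rangeFrom(std::uint64_t offset) {
    return "Range: bytes=" + std::to_string(offset) + "-";
}

// Hypothetical helper: header asking for bytes first..last, both inclusive.
std::string rangeBetween(std::uint64_t first, std::uint64_t last) {
    return "Range: bytes=" + std::to_string(first) + "-" + std::to_string(last);
}
```

A player that knows which byte offset corresponds to the seek position (for example from an index in the media file) can send such a header instead of downloading the file from the start.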
A similar but slightly more complex mechanism is behind the various adaptive bit rate protocols, such as HLS, MPEG-DASH, Smooth Streaming, etc. These protocols break a video into multiple chunks (e.g. 10-second-long segments) and also create several different encodings of the video, each at a different bit rate.
The client can then request the next chunk based on current network conditions - if the network is busy or if the client is using a low-bandwidth connection, it can request the next chunk from a low bit rate encoding of the video. If network connectivity improves, it can request from progressively higher bit rates until it reaches the maximum.
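The selection step can be sketched roughly as below. This is only an illustration of the idea, not how any particular player works: the function name pickRendition, the safety factor, and the bit rate ladder in the usage note are all assumptions. It expects a non-empty list of bit rates sorted in ascending order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical selection rule: return the index of the highest bit rate that
// fits within a fraction (`safety`) of the measured throughput. If nothing
// fits, fall back to the lowest rendition (index 0).
// Precondition: `bitratesKbps` is non-empty and sorted in ascending order.
std::size_t pickRendition(const std::vector<std::uint32_t>& bitratesKbps,
                          std::uint32_t throughputKbps,
                          double safety = 0.8) {
    const double budgetKbps = throughputKbps * safety;
    std::size_t choice = 0;
    for (std::size_t i = 0; i < bitratesKbps.size(); ++i) {
        if (bitratesKbps[i] <= budgetKbps) {
            choice = i;  // later entries have higher bit rates, so keep the last fit
        }
    }
    return choice;
}
```

With an assumed ladder of 400, 1200, 2500, and 5000 kbps and a measured throughput of 4000 kbps, the budget is 3200 kbps and the 2500 kbps encoding is chosen for the next chunk.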
You can see this in action if you look at the 'stats for nerds' available in YouTube if you right click on the video - look at the connection speed graph.
This mechanism also means the client can request chunks from further ahead (or behind) than the current position in the video - so long as it is not live obviously!
It also allows faster start up if you do jump ahead as the playback can start from a lower bit rate which is faster to download and work up to the higher bit rate again. You can often see this when playing around with services like Netflix - if you jump ahead it may be lower quality for a little time initially.
YouTube stores the videos in several chunks. Once each chunk completes its download, you can play that chunk of video. Think of them as individual split videos.
When you try to jump into the middle, they will start downloading the necessary chunk of video and start playing. Therefore, you can jump to the middle.
I am developing an application in C# using DirectShow.NET. I am using a virtual camera to record the desktop screen. So my graph is:
Virtual cam --->color space converter --->sample grabber ---> ASF writer.
While coding, I used a custom .prx file which I generated with Windows Media Profile Editor and applied to IConfigAsfWriter using WMCreateProfileManager.
In the .prx file the mode is CBR, the codec is Windows Media Video 9, and the frame rate is 15 fps with a 759 Kbps video bit rate, but the video still looks blurry. If I increase the video bit rate up to 5 Mbps the blurriness goes away, but the higher bit rate results in a large file (54 seconds of recording produces a 10 MB file).
I tried another graph in GraphEdit: virtual cam ---> AVI mux ---> File writer, but this also generates a large .avi file.
How can I record video without the blur effect while keeping the file size to a minimum, e.g. 2-3 MB for 1 minute of video?
Do I need to use a video compressor?
The quality depends on the codec you use, but also on the number of bits per pixel. You can calculate it this way:
bits/pixel = bitrate / (width * height * framerate)
(bitrate in bits/second and framerate is in frames/second)
So if you want to reduce the bitrate without getting blurry video, you also have to reduce the resolution or framerate. That way you keep the number of bits per pixel the same.
I'm using two custom push filters to inject audio and video (uncompressed RGB) into a DirectShow graph. I'm making a video capture application, so I'd like to encode the frames as they come in and store them in a file.
Up until now, I've used the ASF Writer to encode the input to a WMV file, but it appears the renderer is too slow to process high resolution input (such as 1920x1200x32). At least, FillBuffer() seems to only be able to process around 6-15 FPS, which obviously isn't fast enough.
I've tried increasing the cBuffers count in DecideBufferSize(), but that only pushes the problem to a later point, of course.
What are my options to speed up the process? What's the right way to do live high res encoding via DirectShow? I eventually want to end up with a WMV video, but maybe that has to be a post-processing step.
You have great answers posted here to your question: High resolution capture and encoding too slow. The task is too complex for the CPU in your system, which is just not fast enough to perform realtime video encoding in the configuration you set it to work.
I want to use a camera which is installed in my computer in a Flex AIR application I'm writing, and I have a few questions regarding the quality options:
Do I have any limitation on the video image quality? If my camera supports HD recording, will I be able to record in HD format via the AIR application?
How can I export the recorded video in any format I want?
If I want to use the same camera for shooting stills, how can I ensure (within the code) the best quality for the resulting pictures?
Thanks for your answers.
1) AIR can play 1080p without a problem, but it cannot 'record' it from the application. There are some workarounds, but you won't have sound. This is something that the OS should handle.
2) You can't, see above.
3) Shooting stills with a camera for picture quality is done on the hardware side, not software. In the software, it would essentially be equal to just taking one frame out of the video.