I am currently working on a video-to-video application which uses the vlcj API 2.2.0 for the media streaming.
What I want to do is calculate the dropped frames of the remote video stream. Specifically, I have set an upper limit on the maximum FPS, and thus the calculation should be:
lostFPS = maximumFPS - currentFPS.
I saw in the vlcj javadoc that the current FPS is provided by the getFps() method, but for some reason it always returns 0, even when the video is streamed normally (both for local and remote streams).
Does anyone know of alternative ways to calculate this loss, or am I missing something?
Best regards,
giannis
libVLC provides statistics about the currently playing media, and vlcj exposes them:
libvlc_media_stats_t stats = mediaPlayer.getMediaStatistics();
int droppedFrames = stats.i_lost_pictures;
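For reference, vlcj is a thin wrapper over libVLC, and the same counters can be read from the native API. A minimal C/C++ sketch, assuming a libvlc_media_t* for the item currently being played (the variable and function names here are illustrative):

#include <vlc/vlc.h>

// Sketch: read playback statistics straight from libVLC (which vlcj wraps).
// Pass the libvlc_media_t* of the item currently being played.
static int lost_frames_so_far(libvlc_media_t *media)
{
    libvlc_media_stats_t stats;
    if (!libvlc_media_get_stats(media, &stats))
        return -1;                      // statistics not available yet
    return stats.i_lost_pictures;       // frames dropped by decode/output
}

Sampling this counter at intervals and dividing the delta by the elapsed seconds gives an approximation of the lost frames per second asked about above.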
I'm trying to build simple app that would stream video from camera using browser to the remote server.
For the camera access from browser I've found a wonderful WebRTC API: getUserMedia.
Now, for streaming it to the server, IIUC the best way would be to use some of the WebRTC API for transport and then use some server-side library to deal with it.
However, at first I went with a bit different approach:
I used a MediaRecorder based on the stream from the camera, and set the timeslice for MediaRecorder.start() to a few hundred milliseconds, e.g. 200. The behaviour I observed was weird with respect to my assumptions about MediaRecorder:
If only one chunk is uploaded to the server, it opens just fine.
If there are multiple chunks, none of them loads correctly on its own; they give the error "Could not determine type of stream". But if I then use ffmpeg to concatenate all the chunks, the resulting file is fine. The same happens if I concatenate the blobs from MediaRecorder.ondataavailable on the client.
Thus the question:
Can the chunks, in theory, be independent video files? Or is that not what MediaRecorder was designed for? If it is not, then why do we even have the option to pass a timeslice parameter to its start() method?
Bonus question
If we set the timeslice comparatively small, e.g. 10 ms, many of the data blobs delivered to MediaRecorder.ondataavailable have size 0. Where can we find some sort of guarantee/spec on the minimal timeslice we can use so that the data blobs are meaningful?
The documentation says the following:
If timeslice is not undefined, then once a minimum of timeslice milliseconds of data have been collected, or some minimum time slice imposed by the UA, whichever is greater, start gathering data into a new Blob blob, and queue a task, using the DOM manipulation task source, that fires a blob event named dataavailable at recorder with blob.
So, my guess is that this is somehow related to some data blobs being of size 0. What does "some minimum time slice imposed by the UA" mean?
PS
Happy to provide code if needed, but the question is not about some specific code. It is about understanding the assumptions behind the MediaRecorder API and why they are there.
The timeslice parameter does not let you create independent media chunks; instead, it gives you an opportunity to save data (e.g. to the filesystem, or upload it to a server) on a regular basis, rather than holding potentially large media content in memory.
I was recently playing a video on YouTube, and a thought came to my mind. As videos are played, a user can skip ahead in the video and the video just resumes at that point without any trouble.
What I can't seem to find is how this works, I know that when I request a file through HTTP it downloads the entire thing, so starting a binary stream halfway through the video doesn't seem possible using HTTP. Is there any RFC or related document on how browsers do this?
Thank you
There are a couple of different technologies, but they all essentially allow you to specify an offset in the video and then download a 'chunk' from there.
The simple way to do this is with byte ranges and HTTP progressive download; there is an RFC which covers this:
https://www.rfc-editor.org/rfc/rfc7233
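As an illustration of the byte-range mechanism described above, here is a minimal C++ sketch using libcurl; the URL and byte offsets are made up, and the server has to support range requests (Accept-Ranges) for this to come back as a 206 partial response:

#include <curl/curl.h>

// Sketch: fetch only bytes 1,000,000-1,999,999 of a media file, i.e. start
// the download partway through instead of from the beginning.
int main(void)
{
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/video.mp4");
        curl_easy_setopt(curl, CURLOPT_RANGE, "1000000-1999999"); // sends Range: bytes=...
        curl_easy_perform(curl);        // the body received is just the requested slice
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    return 0;
}

Roughly speaking, a player does the same thing when you seek: it maps the target time to a byte offset using the container's index and issues a ranged request from there.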
A similar but slightly more complex mechanism is behind the various adaptive bit rate protocols, such as HLS, MPEG-DASH, Smooth Streaming, etc. These protocols break a video into multiple chunks (e.g. 10-second-long segments) and also create several different encodings of the video, each at a different bit rate.
The client can then request the next chunk based on current network conditions: if the network is busy or if the client is on a low-bandwidth connection, it can request the next chunk from a low bit rate encoding of the video. If network connectivity improves, it can request from progressively higher bit rates until it reaches the maximum.
You can see this in action in the 'stats for nerds' panel in YouTube (right-click on the video) - look at the connection speed graph.
This mechanism also means the client can request chunks from further ahead (or behind) than the current position in the video - so long as it is not live obviously!
It also allows faster start-up if you do jump ahead, as playback can start from a lower bit rate, which is faster to download, and then work up to the higher bit rates again. You can often see this when playing around with services like Netflix: if you jump ahead, the video may be lower quality for a short time initially.
YouTube stores the videos in several chunks. Once each chunk completes its download, you can play that chunk of video. Think of them as individual split videos.
When you try to jump into the middle, they will start downloading the necessary chunk of video and start playing. Therefore, you can jump to the middle.
I was using a callback mechanism to grab the webcam frames in my media application. It worked, but was slow due to certain additional buffer functions that were performed within the callback itself.
Now I am trying the other way to get frames, that is, calling a method to grab the frame (instead of a callback). I used a sample on CodeProject which makes use of IVMRWindowlessControl9::GetCurrentImage.
I encountered the following issues.
With a Microsoft webcam, the preview didn't render (only a black screen) on Windows 7, but the same camera rendered the preview on XP.
My doubt here is: can the VMR-specific functionality depend on the camera drivers on different platforms? Otherwise, how could this difference happen?
Wherever the sample application worked, I observed that the biBitCount member of the resulting BITMAPINFOHEADER structure is 32.
Is this a value set by the application, or a driver setting for VMR operations? How is it configured?
Finally, which is the best method to grab webcam frames: a callback approach, or a direct approach?
Thanks in advance,
IVMRWindowlessControl9::GetCurrentImage is intended for occasional snapshots, not for regular image grabbing.
Quote from MSDN:
This method can be called at any time, no matter what state the filter
is in, whether running, stopped or paused. However, frequent calls to
this method will degrade video playback performance.
This method reads back from video memory, which is slow in the first place. It also performs a conversion (slow again) to the RGB color space, because that format is the most suitable for non-streaming apps and gives fewer compatibility issues.
All in all, you can use it for periodic image grabbing; however, this is not what you are supposed to do. To capture at streaming rate you need to use a filter in the pipeline, or a Sample Grabber with a callback, as sketched below.
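A rough sketch of that callback route, assuming the stock Sample Grabber filter (CLSID_SampleGrabber, with ISampleGrabber/ISampleGrabberCB from qedit.h; newer SDKs may need those definitions pulled in separately). Graph building and error handling are left out:

#include <dshow.h>
#include <qedit.h>   // ISampleGrabber / ISampleGrabberCB

// Receives every frame on the streaming thread while the graph runs.
class FrameCallback : public ISampleGrabberCB
{
public:
    STDMETHODIMP BufferCB(double sampleTime, BYTE *buffer, long bufferLen)
    {
        // Copy or process the frame here; keep it fast so the graph is not stalled.
        return S_OK;
    }
    STDMETHODIMP SampleCB(double, IMediaSample *) { return E_NOTIMPL; }

    // Minimal COM plumbing for an object owned by the application.
    STDMETHODIMP QueryInterface(REFIID riid, void **ppv)
    {
        if (riid == IID_IUnknown || riid == IID_ISampleGrabberCB) { *ppv = this; return S_OK; }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    STDMETHODIMP_(ULONG) AddRef()  { return 2; }
    STDMETHODIMP_(ULONG) Release() { return 1; }
};

// During graph construction, after adding the Sample Grabber filter:
//   ISampleGrabber *grabber = ...;       // queried from the grabber filter
//   static FrameCallback cb;
//   grabber->SetCallback(&cb, 1);        // 1 selects BufferCB
//   grabber->SetBufferSamples(FALSE);    // no internal copy needed

Unlike GetCurrentImage, the callback is fed in-stream for every sample, so there is no read-back from video memory and no forced colour-space conversion.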
I am able to extract samples out of a video using the ReadSample method. Now, how can I play the data present in those samples? Or, how do I play an IMFSample?
An IMFSample is a block of data, such as a video frame or a chunk of an audio sequence. It is too tiny a piece of data to be played on its own. The API addresses more sophisticated playback scenarios, where playback is a session in which one or more streams are streamed in sync.
Be sure to check Getting Started with MFPlay on MSDN to see how playback is set up with Media Foundation.
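To give an idea of how little code that takes, here is a minimal C++ sketch with MFPlay; the window handle and file path are placeholders:

#include <mfplay.h>
#pragma comment(lib, "mfplay.lib")

// Sketch: let MFPlay build the whole playback pipeline for a media file.
HRESULT PlayFile(HWND hwnd)
{
    IMFPMediaPlayer *player = NULL;
    HRESULT hr = MFPCreateMediaPlayer(
        L"C:\\media\\clip.mp4",   // placeholder path
        TRUE,                     // start playback as soon as the media opens
        0,                        // default creation options
        NULL,                     // no event callback in this sketch
        hwnd,                     // window used for video display
        &player);
    // Keep 'player' alive while playing; call player->Release() when done.
    return hr;
}

If you stay with IMFSourceReader and its samples instead, you end up re-implementing much of what such a playback session already does: decoding, audio/video sync and presentation timing.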
I have to configure my video camera display resolution before capturing and processing the data. Initially I did it as follows.
1. Created all the necessary interfaces.
2. Added the camera and renderer filters.
3. Did RenderStream with the Capture and Preview pin categories.
4. Looped through the AM_MEDIA_TYPE structures and set the parameters.
This worked for a lot of cameras, but a few cameras failed. Then I swapped the order of steps 3 and 4 above; that is, I set the parameters before the RenderStream. This time the failing cases went through, but a few on-board cameras (e.g. in a SONY VAIO laptop) seemed to fail.
Now, my questions are:
Which is the optimal and correct method of getting and setting the AM_MEDIA_TYPE parameters and running the graph?
If different cameras need different orders, then a way to tell which order is best for a particular camera by going through the camera's DirectShow interfaces would also serve my purpose.
Please help me in this at the earliest,
Thanks and regards,
Shiju
IAMStreamConfig::SetFormat needs to be used to set the capture format before the pin is connected and rendered. This way the downstream subchain of filters is built with the proper media types.
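A sketch of that order, assuming an ICaptureGraphBuilder2* and the camera's IBaseFilter* are already in the graph; the 640x480 target is only an example, and DeleteMediaType is the helper from the DirectShow base classes:

#include <dshow.h>

// Sketch: choose a capture format with IAMStreamConfig::SetFormat, then render.
HRESULT ConfigureAndRender(ICaptureGraphBuilder2 *builder, IBaseFilter *camera)
{
    IAMStreamConfig *config = NULL;
    HRESULT hr = builder->FindInterface(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                                        camera, IID_IAMStreamConfig, (void**)&config);
    if (FAILED(hr))
        return hr;

    int count = 0, size = 0;
    config->GetNumberOfCapabilities(&count, &size);

    for (int i = 0; i < count; ++i)
    {
        AM_MEDIA_TYPE *pmt = NULL;
        VIDEO_STREAM_CONFIG_CAPS caps;
        if (FAILED(config->GetStreamCaps(i, &pmt, (BYTE*)&caps)))
            continue;

        if (pmt->formattype == FORMAT_VideoInfo && pmt->pbFormat)
        {
            VIDEOINFOHEADER *vih = (VIDEOINFOHEADER*)pmt->pbFormat;
            if (vih->bmiHeader.biWidth == 640 && vih->bmiHeader.biHeight == 480)
            {
                config->SetFormat(pmt);   // set BEFORE the pin is connected
                DeleteMediaType(pmt);
                break;
            }
        }
        DeleteMediaType(pmt);
    }
    config->Release();

    // Only now render the stream, so downstream filters connect with the chosen type.
    return builder->RenderStream(&PIN_CATEGORY_PREVIEW, &MEDIATYPE_Video,
                                 camera, NULL, NULL);
}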