I was recently playing a video on YouTube, and a thought came to my mind. As videos are played, a user can skip ahead in the video and the video just resumes at that point without any trouble.
What I can't seem to find is how this works. I know that when I request a file through HTTP it downloads the entire thing, so starting a binary stream halfway through the video doesn't seem possible using HTTP. Is there any RFC or related document on how browsers do this?
Thank you
There are a couple of different technologies, but they all essentially allow you to specify an offset in the video and then download a 'chunk' from there.
The simple way to do this is with byte ranges and HTTP progressive download - there is an RFC which covers this:
https://www.rfc-editor.org/rfc/rfc7233
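As a rough illustration of the byte-range mechanism: the client sends a Range header, and a server that supports ranges answers with 206 Partial Content and a Content-Range header. The sketch below only builds and parses those header values; the function names are made up for this example and no network request is made.

```python
import re


def range_header(start, end=None):
    """Build a Range header value for bytes start..end (inclusive).

    With end=None the range is open-ended ("from start to end of file").
    """
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    return f"bytes={start}-{'' if end is None else end}"


def parse_content_range(value):
    """Parse a value such as 'bytes 1000-1999/5000' into (first, last, total).

    total is None when the server reports the full length as '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


# Example: ask for everything from byte 5000 onwards.
headers = {"Range": range_header(5000)}
```

A player that seeks would send the header with the offset it needs and then check the parsed Content-Range to confirm which bytes actually came back.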
A similar but slightly more complex mechanism is behind the various adaptive bit rate protocols, such as HLS, MPEG-DASH, Smooth Streaming etc. These protocols break a video into multiple chunks (e.g. 10-second segments) and also create several different encodings of the video, each at a different bit rate.
The client can then request the next chunk based on current network conditions - if the network is busy or if the client is using a low-bandwidth connection, it can request the next chunk from a low bit rate encoding of the video. If network connectivity improves, it can request from progressively higher bit rates until it reaches the maximum.
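A minimal sketch of that selection step, assuming the client already has a throughput estimate. Real players use more elaborate heuristics (buffer level, smoothing over several chunks); the function name, the bit rate ladder and the 0.8 safety factor are all invented for illustration.

```python
def pick_bitrate(available_bps, measured_bps, safety=0.8):
    """Pick the highest encoding that fits within a fraction of measured throughput.

    Falls back to the lowest encoding when even that one does not fit.
    """
    usable = measured_bps * safety
    candidates = [rate for rate in sorted(available_bps) if rate <= usable]
    return candidates[-1] if candidates else min(available_bps)


# Hypothetical ladder of encodings, in bits per second.
ladder = [400_000, 1_200_000, 3_000_000]
next_chunk_rate = pick_bitrate(ladder, measured_bps=2_000_000)
```

Calling this once per chunk is what lets the quality step up or down as conditions change.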
You can see this in action if you look at the 'stats for nerds' available in YouTube if you right click on the video - look at the connection speed graph.
This mechanism also means the client can request chunks from further ahead (or behind) than the current position in the video - so long as it is not live obviously!
It also allows faster start up if you do jump ahead as the playback can start from a lower bit rate which is faster to download and work up to the higher bit rate again. You can often see this when playing around with services like Netflix - if you jump ahead it may be lower quality for a little time initially.
YouTube stores the videos in several chunks. Once each chunk completes its download, you can play that chunk of video. Think of them as individual split videos.
When you try to jump into the middle, they will start downloading the necessary chunk of video and start playing. Therefore, you can jump to the middle.
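To make the "jump to the middle" step concrete, here is a small sketch that maps a seek time to the chunk that contains it. It assumes fixed-length chunks (10 seconds here, purely as an example); real services may use different or variable chunk lengths, and the function name is hypothetical.

```python
def segment_for_seek(seek_seconds, segment_seconds=10.0):
    """Return (chunk index, offset in seconds within that chunk) for a seek time."""
    if seek_seconds < 0 or segment_seconds <= 0:
        raise ValueError("seek time must be >= 0 and segment length > 0")
    index = int(seek_seconds // segment_seconds)
    offset = seek_seconds - index * segment_seconds
    return index, offset


# Seeking to 1 min 35 s: which chunk to download, and where to start inside it.
index, offset = segment_for_seek(95.0)
```

The player then requests the chunk with that index and starts decoding from the offset within it.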
My application must read one video track and several audio tracks, and be able to specify one section of the file and play it in loop. I have created a setup with Media Foundation, using the sequencer source and creating several topologies with the start and end point of the section I want to loop. It works, except for the fact that there is a 0.5 to 1 sec time of stabilization of the playback just when it goes back to the starting point.
First, I made it with individual audio files and one video file. This was quite bad for some files: sometimes all the files were completely out of sync, sometimes the video was frozen for several seconds, then played very fast to catch up with the audio.
I had a good improvement using only one file, that includes the video and the multiple audio tracks. However, for most files, there is still a problem about the smoothness of the transition.
With a poor quality video AVI file, I could make it work smoothly, which would mean that the method I use is correct. I have noticed that the quality of the loop smoothness is strongly related to the CPU used on a file when simply playing it.
I call "SetTopology" on the session with a series of topologies, so normally it should preroll the next one during the playback of the current one, right? Or am I missing something there?
My app works also on Mac, where I have used a similar setup with AVFoundation, and it works fine with the same media files I use on Windows.
What can I do to make the looping work smoothly with better quality video on Windows? Is there something I can do about it?
When I play the media file without looping, I notice that when I preroll it to some point and then hit the START button, the media starts instantly and with no glitch. Could it work better if I used two independent simple playback setups: start the first, preroll the second, then stop the first and start the second programmatically at the looping point?
I'm using the open source Torque 3D game engine for an aviation simulator project.
I need to generate a single image from several IG (image generator) PCs. Each IG display has its own view camera with a certain angle offset and gets the current position from the server via LAN.
I've already set up the multi-IG system.
Network connection is robust (less than 1 ms).
Frame rate is good as well - about 70 FPS on each IG.
However, while moving, the whole picture looks broken because some IGs are updating their views faster than others.
I'm looking for a solution that will make the IGs update simultaneously. Maybe some kind of precise time synchronization algorithm that makes different PCs connected via LAN act as one.
I had a much simpler problem, but my approach might help you.
You've got to run clocks on all your machines with, say, a 15 millisecond tick. Each image needs to be generated correctly for a specific tick and marked with its tick ID time. The display machine can check its own clock, determine the specific tick number (time) for which it should display, grab the images for that specific time, and display them.
(To have the right mindset to think about this, imagine your network is really bad and think about one IG delivering 1000 images ahead of the current display tick while another is 5 ticks behind. Write for this sort of system and the results will look really good on the one you have.)
Ideally you want your display running a bit behind the IGs so you always have a full set of images for the current tick. I had a client-server setup and slowed the display (client) timer down if it came close to missing updates and sped it up if it was getting too far behind. You have to synchronize all your IG machines, so it might be better to have the master clock on the display and have it send messages to speed up any IG machine that's getting behind. (You may not have the variable network delays I had, but it's best to plan for them.)
The key is that each image must be made at a particular time, that the display include only images for the time being displayed, and that the composite images appear right when they should (every 15 milliseconds, on the millisecond). Also, do not depend on your network or even your machines to do anything in a timely manner. Use feedback to keep everything synched.
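A rough sketch of the tick idea, under the assumption that every image arrives tagged with the ID of the IG that made it and the tick it was rendered for. The class and function names are invented for this example, and the 15 ms tick is just the figure used above.

```python
from collections import defaultdict

TICK_MS = 15  # tick length from the discussion above


def tick_id(time_ms):
    """Map a clock reading in milliseconds to a tick number."""
    return time_ms // TICK_MS


class FrameAssembler:
    """Collects images per tick and hands out only complete sets."""

    def __init__(self, ig_ids):
        self.ig_ids = set(ig_ids)
        self.frames = defaultdict(dict)  # tick -> {ig_id: image}

    def add(self, ig_id, tick, image):
        self.frames[tick][ig_id] = image

    def complete_set(self, tick):
        """Return {ig_id: image} for the tick, or None if any IG is missing."""
        got = self.frames.get(tick, {})
        return dict(got) if set(got) == self.ig_ids else None

    def discard_before(self, tick):
        """Drop images for ticks that have already been displayed."""
        for old in [t for t in self.frames if t < tick]:
            del self.frames[old]
```

The display loop would compute the tick it is due to show, ask for the complete set for that tick, and discard everything older. Images that arrive early simply wait in the dictionary until their tick comes up.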
Addition On Feedback:
Say the last image for the frame at time T arrives 5ms after time T by the display computer's time (real time). If you display the frame for time T at T plus 10 ms, no one will notice the lag and you'll have plenty of time to assemble the images. Using a constant (10 ms) delay might work for you, especially if you make it big enough. It may be the way to go if you always run with the exact same network.
But you are depending on all your IG machines being precisely synchronized for real time, taking no more than a certain amount of time to produce their image, and delivering their image to the display machine all in predictable lengths of time.
What I'd suggest is to have your display machine determine the delay based on the time stamps on the images it receives. It would want to increase the delay if it isn't getting the images it needs in time, and decrease it if all the IGs are running several images ahead of what the display needs. (You might want to ignore the occasional really late image. You have to decide which is more annoying: images that are out of date, a display that is running noticeably behind time, or a display that noticeably speeds up and slows down.)
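One simple way to sketch that feedback, not necessarily the scheme I used: after each frame, measure the margin between the scheduled display time and the arrival of the last image for that frame, and nudge the delay by a small step. The function name, step size, slack and limits are all assumptions chosen for illustration.

```python
def adjust_delay(delay_ms, margin_ms, step_ms=1.0, slack_ms=5.0,
                 min_ms=0.0, max_ms=100.0):
    """Return the new display delay.

    margin_ms is (scheduled display time - arrival time of the last image).
    A negative margin means the frame was late, so the delay grows.
    A margin above slack_ms means there is time to spare, so the delay shrinks.
    """
    if margin_ms < 0:
        delay_ms += step_ms
    elif margin_ms > slack_ms:
        delay_ms -= step_ms
    return max(min_ms, min(max_ms, delay_ms))


# Last image arrived 2 ms after the frame was due, so back off slightly.
delay = adjust_delay(10.0, margin_ms=-2.0)
```

Keeping the step small avoids a display that visibly speeds up and slows down; ignoring occasional outliers before calling this is left out of the sketch.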
In my original answer I was suggesting some kind of feedback from the display to keep the IG machines running on time, but that may be overkill: your computers' clocks are probably good enough for that.
Very generally, when any two processes have to coordinate over time, it's best if they talk to each other to stay in step (feedback) rather than each stick to a carefully timed schedule.
I am creating an app which uses sockets to send data to other devices. I am using the HTTP protocol to send and receive data. Now the problem is, I have to stream a video and I don't know how to send a video (or stream a video).
If the user jumps directly to the middle of the video, how should I send the data?
Thanks...
HTTP wasn't really designed with streaming in mind. Honestly the best protocol is something UDP-based (SCTP is even better in some ways, but support is sketchy). However, I appreciate you may be constrained to HTTP so I'll answer your question as written.
I should also point out that streaming video is actually quite a deep topic and all I can do here is try to touch on some of the approaches that you might want to investigate. If you have control of the end-to-end solution then you have some choices to make - if you only control one end, then your choices are more or less dictated by what's available at the other end.
If you only want to play from the start of the file then it's fairly straightforward - make a standard HTTP request and start playing as soon as you've buffered enough video that the rest of the file will finish downloading before playback catches up with it. You don't need any special server support for this and any video format will work.
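As a back-of-the-envelope sketch of "buffered enough": if the download rate is roughly constant and the video is encoded at a roughly constant bit rate, playback can start once the remaining download time no longer exceeds the video's duration. The function name and the figures are made up for this example, and a real player would keep re-estimating the download rate.

```python
def startup_delay_seconds(file_bytes, download_bps, duration_s):
    """Seconds to wait before starting playback so the download stays ahead.

    Assumes a constant download rate (bits per second) and a constant
    bit rate encoding.
    """
    download_time_s = file_bytes * 8 / download_bps
    return max(0.0, download_time_s - duration_s)


# A 100 MB file that plays for 80 s, downloaded at 8 Mbit/s.
wait_s = startup_delay_seconds(100_000_000, 8_000_000, 80)
```

When the connection is faster than the video's bit rate the result is zero, which matches the intuition that playback can begin almost immediately.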
Seeking is trickier. You could take the approach that sites like YouTube used to take which is to simply not allow the user to seek until the file has downloaded enough to reach that point in the video (or just leave them looking at a spinner until that point is reached). This is not the user experience that most people will expect these days, however.
To do better you need to be in control of the streaming client. I would suggest treating the file in chunks and making byte range requests for one chunk at a time. When the user seeks into the middle of the file, you can work out the byte offset into the file and start making byte range requests from that point.
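A minimal sketch of the chunked approach: given the byte offset you worked out for the seek point, generate the Range header values for successive chunks. The 1 MB chunk size and the function name are arbitrary choices for illustration; the HTTP requests themselves are left to whatever client library you use.

```python
def chunk_ranges(start_offset, file_size, chunk_size=1_000_000):
    """Yield Range header values covering start_offset up to the end of the file."""
    pos = start_offset
    while pos < file_size:
        end = min(pos + chunk_size, file_size) - 1  # Range ends are inclusive
        yield f"bytes={pos}-{end}"
        pos = end + 1


# After a seek to byte 2,500,000 of a 4,000,000-byte file:
ranges = list(chunk_ranges(2_500_000, 4_000_000))
```

Requesting one chunk at a time means a later seek only wastes the chunk currently in flight rather than the rest of the file.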
If the video format contains some sort of index at the start then you can use this to work out file offsets - so, your video client would have to request at least enough to get the index before doing any seeking.
If the format doesn't have any form of index but it's encoded at a constant bit rate (CBR) then you can do an initial HEAD request and look at the Content-Length header to find the size of the file. Then, if the user seeks 40% of the way through the video, for example, you just seek to 40% of the way through the encoded frames. This relies on knowing enough about the file format that you can calculate an appropriate seek point so that you can identify framing information and the like (or at least an encoding format which allows you to resynchronise with both the audio and video streams even if you jump in at an arbitrary point in the file). This approach might also work with variable bit rate (VBR) as long as the format is such that you can recover from an arbitrary seek.
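The CBR calculation can be sketched as below. It assumes you know how many bytes of header precede the encoded frames (zero by default here); the function and parameter names are hypothetical, and aligning the result to a frame boundary depends on the format, so it is not shown.

```python
def cbr_seek_offset(content_length, fraction, header_bytes=0):
    """Approximate byte offset for a seek to `fraction` (0.0 to 1.0) of a CBR file.

    content_length comes from the Content-Length header of a HEAD response.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    payload_bytes = content_length - header_bytes
    return header_bytes + round(payload_bytes * fraction)


# User seeks 40% of the way through a 10,000,000-byte file.
offset = cbr_seek_offset(10_000_000, 0.4)
```

The resulting offset is what goes into the first byte-range request after the seek.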
It's not ideal but as I said, HTTP wasn't really designed for streaming.
If you have control of the file format and the server, you could make life easier by making each chunk a separate resource. This is how Apple HTTP Live Streaming and Microsoft Smooth Streaming both work. They need tool support to pre-process the video, and I don't know if you have control of the server end. It might be worth looking into, however. These also do more clever tricks, such as allowing a client to switch between multiple versions of the stream encoded at different bit rates to cope with differences in bandwidth.
I'm using two custom push filters to inject audio and video (uncompressed RGB) into a DirectShow graph. I'm making a video capture application, so I'd like to encode the frames as they come in and store them in a file.
Up until now, I've used the ASF Writer to encode the input to a WMV file, but it appears the renderer is too slow to process high resolution input (such as 1920x1200x32). At least, FillBuffer() seems to only be able to process around 6-15 FPS, which obviously isn't fast enough.
I've tried increasing the cBuffers count in DecideBufferSize(), but that only pushes the problem to a later point, of course.
What are my options to speed up the process? What's the right way to do live high res encoding via DirectShow? I eventually want to end up with a WMV video, but maybe that has to be a post-processing step.
You have great answers posted here to your question: High resolution capture and encoding too slow. The task is too complex for the CPU in your system, which is just not fast enough to perform real-time video encoding in the configuration you have set up.
I'm writing software which demonstrates a video on demand service. One of the features is something similar to IIS Smooth Streaming - I want to adjust quality to the bandwidth of the client. My idea is to split a single movie into many parts of, let's say, 2 seconds each, in different qualities, and then send them to the client and play them. The point is that, for example, the first part can be in very high quality and the second in really poor quality (if the bandwidth seems to be poor). The question is - do you know any software that allows me to cut movies precisely? For example, ffmpeg splits movies in a way that makes the join visible and really annoying (seconds are the measure of precision). I use Qt + Phonon as a player, if it matters. Or maybe you know a better way to provide such a feature, without splitting the movie into parts?
Are you sure ffmpeg's precision is in seconds? Here's an excerpt from the man page:
-t duration
Restrict the transcoded/captured video sequence to the duration specified in seconds. "hh:mm:ss[.xxx]" syntax is also supported.
-ss position
Seek to given time position in seconds. "hh:mm:ss[.xxx]" syntax is also supported.
-itsoffset offset
Set the input time offset in seconds. "[-]hh:mm:ss[.xxx]" syntax is also supported. This option affects all the input files that follow it. The offset is added to the timestamps of the input files. Specifying a positive offset means that the corresponding streams are delayed by 'offset' seconds.
Looks like it supports up to millisecond precision, and since most video is not more than 1000 frames per second, this would be more than enough precision to accurately seek through any video stream.
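For illustration, here is a small sketch that formats times in the "hh:mm:ss.xxx" syntax from the man page and builds an ffmpeg argument list using the -ss and -t options quoted above. The helper names and file names are made up, and no codec options are shown. Whether the cut lands exactly where requested also depends on how the video is encoded, since stream copying without re-encoding generally cuts at keyframes.

```python
def ffmpeg_timestamp(seconds):
    """Format a time in seconds as hh:mm:ss.xxx (millisecond precision)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def segment_command(src, dst, start_s, duration_s):
    """Build an argument list that extracts duration_s seconds starting at start_s."""
    return [
        "ffmpeg",
        "-ss", ffmpeg_timestamp(start_s),    # seek to the start position
        "-i", src,
        "-t", ffmpeg_timestamp(duration_s),  # limit the output duration
        dst,
    ]


# Second 2-second part of a hypothetical input file.
cmd = segment_command("movie.mp4", "part_0001.mp4", 2, 2)
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed.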
Are you sure this is a good idea? Checking the bandwidth and switching out clips every two seconds seems like it will only allow you to buffer two seconds into the future at any given point, and unless the client has some Godly connection, it will appear extremely jumpy.
And what about playback, if the user replays the video? Would it recalculate the quality as it replays, or do you build the video file while streaming?
I am not experienced in the field of streaming video, but it seems what I see most often is that the provider has several different quality versions of their video (from extremely low to HD), and they test the user's bandwidth and then stream at an appropriate quality.
(I apologize if I misunderstood the question.)