Vulkan: trouble understanding cycling of framebuffers - asynchronous

In Vulkan,
A semaphore(A) and a fence(X) can be passed to vkAcquireNextImageKHR. That semaphore(A) is subsequently passed to vkQueueSubmit, to wait until the image is released by the Presentation Engine (PE). A fence(Y) can also be passed to vkQueueSubmit. Client code can check when the submission has completed by checking fence(Y).
When fence(Y) signals, this means the PE can display the image.
My question:
How do I know when the PE has finished using the image after a call to vkQueuePresentKHR? To me, it doesn't seem that it would be by checking fence(X), because that is for client code to know when the image can be written to by vkQueueSubmit, isn't it? After the image is sent to vkQueueSubmit, it seems the usefulness of fence(X) is done. Or, can the same fence(X) be used to query the image availability after the call to vkQueuePresentKHR?
I don't know when the image is available again after a call to vkQueuePresentKHR, without having to call vkAcquireNextImageKHR.
The reason this is causing trouble for me is that in an asynchronous, 60fps, triple buffered app (throwaway learning code), things get out of wack like this:
Send an initial framebuffer to the PE. This framebuffer is now unavailable for 16 milliseconds.
Within the 16ms, acquire a second image/framebuffer, submit commands, but don't present.
Do the same as #2, for a third image. We submit it before 16ms.
16ms have gone by, so we vkQueuePresentKHR the second image.
Now, if I call vkAcquireNextImageKHR, the whole thing can fail if image #1 is not yet done being used, because I have acquired three images at this point.
How to know if image #1 is available again without calling vkAcquireNextImageKHR?

How do I know when the PE has finished using the image after a call to vkQueuePresentKHR?
You usually do not need to know.
Either you need to acquire a new VkImage, or you don't. Whether PE has finished or not does not even enter that decision.
Only reason wanting to know is if you want to measure presentation times. There's a special extension for that: VK_GOOGLE_display_timing.
After the image is sent to vkQueueSubmit, it seems the usefulness of fence(X) is done.
Well, you can reuse the fence. But the Implementation has stopped using it as soon as it was signaled and won't be changing its state anymore to anything, if that's what you are asking (and so you are free to vkDestroy it or do other things with it).
I don't know when the image is available again after a call to vkQueuePresentKHR, without having to call vkAcquireNextImageKHR.
Hopefully I cover it below, but I am not precisely sure what the problem here is. I don't know how to eat a soup without a spoon neither. Simply use a spoon— I mean vkAcquireNextImageKHR.
Now, if I call vkAcquireNextImageKHR, the whole thing can fail if image #1 >is not yet done being used, because I have acquired 3 images at this point.
How to know if image #1 is available again without calling >vkAcquireNextImageKHR?
How is it any different than image #1 and #2?
Yes, you may have already acquired all the images the swapchain has to offer, or the PE is "not ready" to give away an image even if it has two.
In the first case the spec advises against calling vkAcquireNextImageKHR with timeout of UINT64_MAX. It is a simple matter of counting the successful vkAcquireNextImageKHR calls vs the vkQueuePresentKHRs. One way may be to simply do one vkAcquireNextImageKHR and then do one vkQueuePresentKHR.
In the second case you can simply call vkAcquireNextImageKHR and you will eventually get the image.

In order to use a swapchain image, You need to acquire it. After that the actual availability of the image for rendering purposes is signaled by the semaphore (A) or the fence (X). You can either use the semaphore (X) during the submission as a wait semaphore or wait on the CPU for the fence (X) and submit after that. For performance reasons the semaphore is a preferred way.
Now when You present an image, You give it back to the Presentation Engine. From now on You cannot use that image for whatever purposes. There is no way to check when that image is available again for You so You can render into it again. You cannot do that. If You want to render into a swapchain image again, You need to acquire another image. And during this operation You once again provide a semaphore or a fence (probably different than those provided when You previously acquired a swapchain image). There is no other way to check when an image is again available than through calling the vkAcquireNextImageKHR() function.
And when You want to implement triple-buffering, You should just select appropriate presentation mode (mailbox mode is the closest match). You shouldn't wait for a specific time before You present an image. You just should present it when You are done rendering into it. Your synchronization should be entirely based on acquire, present commands and semaphores or fences provided during these operations and during submission. Appropriate present mode should do the rest. Detailed explanation of different present modes is available in Intel's tutorial.

Related

How can I make a program wait in OCaml?

I'm trying to make a tetris game in ocaml and i need to have a piece move through the graphics screen at a certain speed.
I think that the best way to do it is to make a recursive function that draws the piece at the top of the screen, waits half a second or so, clears that piece from the screen and redraws it 50 pixels lower. I just don't know how to make the programm wait. I think that you can do it using the Unix module but idk how..
Let's assume you want to take a fairly simple approach, i.e., an approach that works without multi-threading.
Presumably when your game is running it spends virtually all its time waiting for input from the user, i.e., waiting for the user to press a key. Most likely, in fact, you're using a blocking read to do this. Since the user can take any amount of time before typing anything (up to minutes or years), this is incompatible with keeping the graphical part of the game up to date.
A very simple solution then is to have a timeout on the read operation. Instead of waiting indefinitely for the user to press a key, you can wait at most (say) 10 milliseconds.
To do this, you can use the Unix.select function. The simplest way to do this is to switch over to using a Unix file descriptor for your input rather than an OCaml channel. If you can't figure out how to make this work, you might come back to StackOverflow with a more specific question.

is this the result of a partial image transfer?

I have code that generates thumbnails from JPEGs. It pulls an image from S3 and then generates the thumbs.
One in about every 3000 files ends up looking like this. It happens in batches. The high res looks like this and they're all resized down to low res. It does not fail on resize. I can go to my S3 bucket and see that the original file is indeed intact.
I had this code written in Ruby and ported it all over to clojure hoping it would just fix my issue but it's still happening.
What would result in a JPEG that looks like this?
I'm using standard image copying code like so
(with-open [in (clojure.java.io/input-stream uri)
out (clojure.java.io/output-stream file)]
(clojure.java.io/copy in out))
Would there be any way to detect the transfer didn't go well in clojure? Imagemagick? Any other command line tool?
My guess is it is one of 2 possible issues (you know your code, so you can probably rule one out quickly):
You are running out of memory. If the whole batch of processing is happening at once, the first few are probably not being released until the whole process is completed.
You are running out of time. You may be reaching your maximum execution time for the script.
Implementing some logging as the batches are processed could tell you when the issue happens and what the overall state is at that moment.

IMediaControl::Run followed by IMediaControl::Stop followed by IMeidaControl::Run doesn't switch on certain Onboard cameras

I have a DirectShow webcam application. I make use of Sample Grabber to get the buffer callbacks and IVideoWindow to control the display co-ordinates for the Preview. I have Preview and Capture Streams which I run as below.
g_pBuild->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,cam,g_pGrabberF,pNullRenderer2); g_pBuild->RenderStream(&PIN_CATEGORY_PREVIEW, &MEDIATYPE_Video,cam,NULL,NULL);
On certain On board cameras, IMediaControl::Run followed by IMediaControl::Stop followed by IMediaCOntrol::Run doesn't switch on the camera.
Extenal USB cameras work properly here. How can I diagnose more on this? Any pointers, please help.
Maybe its specific to a certain hardware issue in the unit.
Do a quick test by adding sleep of 1 sec between calls.
If it does help than you need to find a way to know when to unit state in idle or not.
There are two important parts of the question which you did not provide:
Filter graph topologies
HRESULTs of the method calls
A problem you might be having is that one of the filters in the topology does not handle well state transitions and fails somewhere between states. Supposedly your second Run meets it still trying to complete Stop. You might get a HRESULT there which indicates the issue (better for you) or the filter fails silently.
The filter graph's is the unlikely source of the bug itself. Chances are high that it does everything flawlessly, however since internally it distributes the calls between filters, one of the filter is letting you down.

Different approaches on getting captured video frames in DirectShow

I was using a callback mechanism to grab the webcam frames in my media application. It worked, but was slow due to certain additional buffer functions that were performed within the callback itself.
Now I am trying the other way to get frames. That is, call a method and grab the frame (instead of callback). I used a sample in CodeProject which makes use of IVMRWindowlessControl9::GetCurrentImage.
I encountered the following issues.
In a Microsoft webcam, the Preview didn't render (only black screen) on Windows 7. But the same camera rendered Preview on XP.
Here my doubt is, will the VMR specific functionalities be dependent on camera drivers on different platforms? Otherwise, how could this difference happen?
Wherever the sample application worked, I observed that the biBitCount member of the resulting BITMAPINFOHEADER structure is 32.
Is this a value set by application or a driver setting for VMR operations? How is this configured?
Finally, which is the best method to grab the webcam frames? A callback approach? Or a Direct approach?
Thanks in advance,
IVMRWindowlessControl9::GetCurrentImage is intended for occasional snapshots, not for regular image grabbing.
Quote from MSDN:
This method can be called at any time, no matter what state the filter
is in, whether running, stopped or paused. However, frequent calls to
this method will degrade video playback performance.
This methods reads back from video memory which is slow in first place. This methods does conversion (that is, slow again) to RGB color space because this format is most suitable for for non-streaming apps and gives less compatibility issues.
All in all, you can use it for periodic image grabbing, however this is not what you are supposed to do. To capture at streaming rate you need you use a filter in the pipeline, or Sample Grabber with callback.

How to get IMediaControl.Run() to start a file playing with no delay

I am attempting to use DirectShow to play two AVI files consecutively (one after the other) so that there is no interruption in the audio or video when the player transitions from one file to the next.
I have two custom controls on my form. Each one is pre-loaded with an AVI file, and before playback begins I set up all the DirectShow interfaces, set the video windows and resize them, call IMediaControl.Run(), then IMediaControl.Pause(), then IMediaSeeking.SetPositions to reset to frame 0, on both controls. On the form, you can see that both files are paused at their initial frames.
I then call IMediaControl.Run() on the first control, and wait for it to complete before calling Run() on the second control. Initially, I hooked into the first video's EC_COMPLETE notification message, and used this to start the second. Thinking that this event might be slow to arrive (turns out it is, but for a weird reason), I tried two other approaches:
Check the first video's current position inside a timer that goes off every second or so (using IMediaPosition.get_CurrentPosition). When the current position is within a second of the video's stop time (known in advance from IMediaPosition.get_StopTime), I go into a tight while loop and wait for the current position to equal the stop time, and then call Run() on the second video.
Same as the first, except I replace the while loop with a call to timeSetEvent from winmm.dll, with a delay set so that it fires right when the first file is supposed to end. I use the callback to Run() the second file.
Either of these two methods substantially cuts down the delay between the end of the first file and the beginning of the second, indicating that the EC_COMPLETE message doesn't arrive immediately after the file is complete (I also tried hooking the EC_SEGMENT_COMPLETE message, which is supposed to be used for looping within a file, but apparently nobody supports this - it never occurs on my machine, at least).
Doing all of the above has cut the transition delay from as much as a second, down to a barely perceptible glitch; about a third of the time the files transition with no interruption at all, which suggests there's no fundamental reason I can't get this to work all the time.
The slight delay is still unacceptable, unfortunately. I assume (and I could easily be wrong) that the remaining delay is due to a slight variable delay between the call to IMediaControl.Run() and when the video actually starts playing.
Does anybody know anything I can do to eliminate this little lag? It would also help to be told this is fundamentally impossible for whatever reason, which wouldn't surprise me. I've never encountered a video player in Windows that doesn't have this problem, so it may not be doable.
More info: the AVI files I'm playing are completely uncompressed (video and audio are uncompressed), so I don't think the lag is due to DirectShow's having to uncompress the video ahead of play start, although it may still buffer ahead as matter of course (and this may be the source of the problem). I would have though that starting play, pausing and then rewinding to the beginning would fix this.
Also, the way I'm handling the transition is to actually have the second control underneath the first; when the first completes playing, I start the second and then call BringToFront on it, creating the appearance of a single video transitioning between the two originals. I don't think the glitch is due to this, because it works perfectly some of the time, and even if this were problematic, it wouldn't explain the matching audio glitch.
Even more: I just tried starting the second video 30-50 milliseconds "early" and that seemed to eliminate even more of the gap, so I'm guessing that the lag in Run() is about that long. It appears to be variable, though, so this is still not where I need it to be.
Still more: perhaps I could eliminate this delay by loading the AVIs from memory rather than from a file. Unfortunately, I have no idea how to do this. IMediaControl only has a RenderFile() method, not something like a RenderStream or RenderMemory method.
If you call IMediaControl::Run on a stopped graph, the graph manager will post the call to a worker thread (so there's some variability). On the worker thread, the graph will be paused. Render filters only complete a pause transition once they have received data, so once GetState() returns S_OK, the graph manager knows that the graph is fully cued. At this point, it picks a time roughly 10ms into the future, and calls Run on each filter with that time as the start point. Since it takes time to tell each filter to Run, the dshow Run method has a parameter which is the refclock time at which a sample timestamped zero should be played -- i.e. the time at which the actual transition to run mode should take place.
To synchronise this with another graph, you first have to ensure that both graphs have the same clock. Query the graph (not the filter) for IMediaFilter, and call GetSyncSource on one graph and SetSyncSource on the other. Then you need to pause the second graph, so that it is cued and ready. When you want to start it, call IMediaFilter::Run instead of IMediaControl::Run, and you can pass your own start time. This still has to be a few milliseconds into the future, so the best thing might be to set the start time of the second graph to be the first graph's start time plus its duration (for an indexed container of uncompressed streams, the duration should be accurate).
Another approach is to use multiple graphs. Separating source from rendering would allow you to switch seamlessly between sources since they feed into a common render graph. There is sample source code for this approach at www.gdcl.co.uk/gmfbridge.
G

Resources