DWM in Win7/8 + GDI

I first noticed this problem once on my Win7 system and assumed it was a DWM bug, since it went away after a reboot. But now I'm realizing that it happens (as default behavior) on other people's systems, and it's the normal behavior on Surface Pro as well.
How to reproduce the problem: implement, using GDI, a basic lasso system. Define a rectangle controlled by the mouse, when the rectangle changes, invalidate the old one & the new one (or invalidate a union of both rectangles, either as a new rectangle or a complex region, it doesn't matter, the "bug" still shows anyway).
During WM_PAINT, you just erase the background & paint the rectangle (it has to be a rectangle outline, the problem won't be visible if it's a filled rectangle). You can do double-buffering if you wanna be sure that it's not a flickering problem (& trust me it's not).
So what you'll see, if you have a system like mine (desktop Win7 with a geforce, aero on), is a normal lasso system, with no more ghosting than the monitor's own.
On other systems (like Surface Pro, to define a fully known system), you'll see, as you extend the lasso outwards, the border of the lasso disappearing. A bit like LCD ghosting, but massively more noticeable.
Now, instead of invalidating the lasso's rectangle, try invalidating the whole window. And there, no more ghosting.
What I found out is that it's not the invalidation that "fixes" it, it's GDI access. You can just as well invalidate the whole rectangle but only paint the lasso's zone, and you still get ghosting. But if you paint the lasso's zone and draw a single little pixel in each corner of the window, no more ghosting.
There must be something in DWM, probably since version 1.1, that uses some kind of caching of the bounding box of the last GDI access, and for some weird reason, what falls within the last bounding box will get immediately on screen, while the new part will be delayed by at least 1 frame.
This is pretty bad because it's breaking very basic window invalidation that everyone uses, and I haven't found any way to fix it (other than by invalidating the whole window of course, but that'd be stupid, & besides it's a problem that affects the whole GDI, so you get poor visual results everywhere).
Again it's most likely in DWM 1.1, I don't think you can get this in Vista, but I'm not sure. I also don't know why it doesn't do that on my desktop, possibly it depends on the graphics card's driver.
So if anyone happens to know more about this...

An update on this. I didn't find any clean solution, but a hack that works.
First, I also discovered that this "bug" seems to affect all systems, just not the same way.
Using DwmGetCompositionTimingInfo, one can make v-synced GDI animation. That's in theory, because in practice it won't work, because of the same "bug", even on systems not visibly affected by what I describe above. The DWM will simply decide not to refresh anything when there isn't enough to refresh, and that will cause frameskips when scrolling a window using ScrollWindowEx if not enough pixels are invalidated.
So what's the solution? Fool the DWM into thinking that it has to swap buffers immediately, as it's what the whole problem is about. When doing GDI operations, one might think that the result will appear at the DWM's next vblank-synced refresh, but it's not the case. Some parts will, some will be delayed. So the trick is to force the DWM to swap, here's what I found about it:
-first, DwmFlush does not do that (nor does GdiFlush). DwmFlush is more or less a WaitForVBlank anyway
-DwmSetPresentParameters doesn't seem to allow this either, though I didn't bother much with that function since it's already gone in Windows 8
-the DWM seems to refresh when there are enough pixels to swap, but it also seems to be partitioned, quite possibly into wide but short rectangles (that's what it appears to be on Surface Pro - but not on my desktop). Whether it's related to apertures to the VRAM or segmentation into small textures, I have no idea, really, maybe someone knows?
So what works for me: telling the GDI to refresh vertical stripes, 1 pixel wide, about 500 pixels apart, all over the screen. If you only do it on parts of the screen, you'll still get flicker on other parts.
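To make the stripe layout concrete, here is a minimal sketch of just the geometry, written in Python for brevity. The function name `stripe_rects` and the 1920x1080 screen size are made up for the example; the actual blit (BitBlt or AlphaBlend on a top-level window) is not shown. Rectangles are (left, top, right, bottom) with right/bottom exclusive, like a Win32 RECT.

```python
def stripe_rects(screen_width, screen_height, spacing=500, stripe_width=1):
    """Return vertical stripe rects (left, top, right, bottom) covering the screen.

    Hypothetical helper: one full-height stripe every `spacing` pixels,
    each `stripe_width` pixels wide, starting at x = 0.
    """
    rects = []
    x = 0
    while x < screen_width:
        right = min(x + stripe_width, screen_width)  # clamp to the screen edge
        rects.append((x, 0, right, screen_height))
        x += spacing
    return rects


# Assumed screen size, for illustration only.
rects = stripe_rects(1920, 1080)
for r in rects:
    print(r)
```

Each of these rectangles would then be the target of one small blit per refresh; the spacing of about 500 pixels is just the value that happened to work in the tests described above.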
It also works using horizontal stripes but then you'll need a lot more of them, that's why I believe that the segmentation is done in wide, short rectangles. You can kinda see those rectangles when tricking the DWM.
Which GDI functions work? Many do, but they don't all have the same CPU usage.
First, forget GetPixel, it works but the CPU usage is obviously extremely high.
Second, the GDI seems to be intelligent enough to detect things that can be buffered just as commands, so filling a rectangle with a null brush, or drawing empty text, won't force the DWM to refresh.
What does work is BitBlt or AlphaBlend using empty vertical stripes. The CPU usage is still ok, even though it's not so far from blitting a whole screen.
It has to be done on a top-level window, not the desktop.
Creating a Direct2D render target for the DC & doing a begin/enddraw also works, but it's normal since it will force a refresh of the full rectangle, & thus the CPU cost is higher.
If anyone knows a better way to force a DWM refresh, I'd like to know. But since Windows 8 already made obsolete most of the interesting DWM functions (fortunately DwmGetCompositionTimingInfo still partially works, or it would be impossible to do vsynced timers without DirectX), I'm not sure there's any better way.
Also, it doesn't have to be done on all top-level windows. You can see the effect working on a Windows desktop blue lasso when it gets near a top-level window that invalidates stripes, it stops flickering as soon as it enters a zone near it (the segmentation I'm talking of above).

Here's a couple of general GDI-related pointers regarding your code:
When you call InvalidateRect API it may not immediately dispatch WM_PAINT notification, so calling InvalidateRect two times in a row like you did in your TForm1.FormMouseMove method is most certainly causing this visual effect. The first redrawing is not yet processed when you call it the second time.
I'm not sure what Canvas.Rectangle exactly does "on the inside" or which APIs it calls. I'm assuming it is using DrawFocusRect, in which case you have to be aware that its drawing is XORed, so doing it twice over the same rectangle will erase it.
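The XOR point can be checked with plain integers. This is only a toy model in Python (a list of RGB values standing in for a row of pixels, and a hypothetical `xor_draw` helper), not what GDI does internally:

```python
def xor_draw(pixels, positions, mask=0xFFFFFF):
    """XOR `mask` into `pixels` at the given indices, in place (toy model)."""
    for i in positions:
        pixels[i] ^= mask


row = [0x102030, 0x405060, 0x708090, 0xA0B0C0]  # made-up pixel values
original = list(row)
outline = [0, 3]  # indices the "outline" touches

xor_draw(row, outline)
changed = row != original   # first pass makes the outline visible

xor_draw(row, outline)
restored = row == original  # second pass over the same pixels erases it
```

So if the same XORed rectangle ends up being drawn twice before the screen is updated, the two passes cancel out.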
OK, so here's what I'd do if I were drawing that selection box:
Call InvalidateRect API only once. For that you will have to calculate the bounding rectangle that will include the position of the selection box before and after the move. If I'm not mistaken, you can use UnionRect API for that, or just calculate the bounding rect yourself.
If avoiding a visual lag is important, I'd do all the drawing of the selection box in TForm1.FormMouseMove. You can get a device context by calling GetDC API on your window handle. After that do the same drawing. In this case you will not need any invalidation. (Sorry, I can't give you a procedure for Delphi. This is how I'd do it with plain WinAPIs.)
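Calculating the bounding rect yourself is just a min/max per edge. A minimal sketch in Python, assuming both rectangles are non-empty and normalized (left < right, top < bottom); `union_rect` is a made-up name, and unlike the real UnionRect it does not special-case empty rectangles:

```python
def union_rect(a, b):
    """Smallest rect (left, top, right, bottom) containing both a and b.

    Assumes both inputs are non-empty and normalized.
    """
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


old_box = (10, 10, 110, 60)   # selection box before the mouse move
new_box = (10, 10, 140, 90)   # selection box after the mouse move
dirty = union_rect(old_box, new_box)
print(dirty)
```

The resulting rectangle is what you would pass to a single InvalidateRect call.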
EDIT: After reviewing your C++ code I was able to reproduce the visual flicker you're describing. I also made two C++ projects, derived from your original code in hopes of solving it: Here's a simple version and here's the one with mouse dragging support. None of them solved the issue though.
I was able to reproduce the flicker on a Windows 7 desktop. After having done some tests I am convinced now that this visual artifact is caused by a display driver. In my case the desktop has ATI Radeon video card. This is how I was able to conclude that: I ran the test executable on a different (older) desktop with Windows XP OS and the flicker was not present. I ran the exact same executable in a virtual machine with Windows XP installed in it on the desktop computer that was producing the flicker. The flicker was present when the executable ran in a Windows XP in that virtual machine. Since the same video driver is responsible for rendering the actual desktop and the one in the virtual machine, there's most definitely some "funny business" going on with optimization or caching in the video driver, that is causing this artifact.
So, the question now is how to resolve it. In my second build of your project, I added the code to render the whole client area for the window to eliminate the possibility of a calculation error, but that did not help. So at this point you're helpless to resolve it via plain APIs.
Your next steps should probably be these:
Contact the driver maker for the video card that you're experiencing this issue on. See if they help. I'd show them your YouTube video of the issue and give them the C++ project to reproduce it with.
Post a question on Windows Driver Development forums, preferably for video driver developers. Unfortunately it's a dwindling community, so expect long delays.
Change the tags on your question here. Add these: C, C++, WinAPI, GDI, DWM and remove what you have there now. This way it will be visible for a Win32 dev community so you'll get more views.
Other than that, good luck! This is such a minor bug (even if one can call it that) that I doubt that you'll get any serious attention to it. Although this is very admirable, in my book, that you're trying to achieve such perfection for your software.
Also, if you need me to second you on this, I can do so.

Related

JavaFX full screen problem on a CentOS without desktop environment

For my case I'm not able to give much detail for a start due to different reasons, but I decided to go straight to the point and ask. First I'd like to point out that this is a very specific question, to prevent unrelated general answers.
For a kiosk JavaFX 8 application that needs to run on CentOS 6.10 without a desktop environment, with only XOrg / X-Server for graphics support, there seems to be no way to prevent the initial login window from going full screen. This login window's layout is defined via FXML as a medium sized rectangle, which works fine in a desktop environment without code differences. I have tried calling Stage's setWidth() and setHeight() methods (planning to try the max variants soon) before and after showing it via the show() method.
Does anyone have any quick idea about what issue could be causing this at first glance? I may provide other details on demand. Nonetheless, I will post any solution I come up with.
Thanks
EDIT: The current XOrg version on the affected machine seems to be xorg-x11-server-Xorg-1.17.4-17.el6.centos.i686
The problem (forced full screen windows) was caused by an existing invocation of dwm within the .xinitrc file, used for a different existing application

Fast (HW-accelerated) drawing to foreign window (probably using Direct3D)

I am working on a project where the task is to paint bitmaps (currently HBITMAP/bitblt/alphablend) into non-client areas of all visible windows (i.e. windows which do not belong to my application). It must be done very fast - the bitmap is updated when a window is moved, resized etc. Also some kind of blur algorithm must be applied to this bitmap. It should work on Win7 and Win8 (so XP is not required).
I have managed to work properly with GDI. I obtain GetWindowDC, GetWindowRect and AlphaBlend bitmap into buffer (CreateCompatibleDC/CreateCompatibleBitmap) and then BitBlt it into GetWindowDC. This works perfectly... except... it is not as fast as I want. And if I apply blur algorithm on the bitmap, then everything is slow as hell.
So my question would be how to improve the speed? I'm thinking about hardware accelerated drawing.
a) I tried GDI-compatible Direct2D (using ID2D1DCRenderTarget+BindDC) but it is much slower than pure GDI.
b) I am thinking about Direct3D. Problem is that I probably don't know how to use it. If I call D3D10CreateDeviceAndSwapChain with swapchain's OutputWindow set to HWND of my application then it returns S_OK but when I set OutputWindow to HWND of any foreign window then the method fails. So I am not sure how to render into foreign windows.
c) how to properly apply blur on a part of image? I found many algorithms but all of them are processed on CPU. How to make it on GPU?
Thanks in advance for any idea how to solve my problem.
Have you thought about using DComp? For why using DComp may be appropriate, take a look at this: http://msdn.microsoft.com/en-us/library/windows/desktop/hh449195%28v=vs.85%29.aspx
For a brief summary of what DComp is (from MSDN):
Microsoft DirectComposition is a Windows component that enables high-performance bitmap composition with transforms, effects, and animations. Application developers can use the DirectComposition API to create visually engaging user interfaces that feature rich and fluid animated transitions from one visual to another.
DirectComposition enables rich and fluid transitions by achieving a high framerate, using graphics hardware, and operating independently of the UI thread. DirectComposition can accept bitmap content drawn by different rendering libraries, including Microsoft DirectX bitmaps, and bitmaps rendered to a window (HWND bitmaps). Also, DirectComposition supports a variety of transformations, such as 2D affine transforms and 3D perspective transforms, as well as basic effects such as clipping and opacity.
DirectComposition is designed to simplify the process of composing visuals and creating animated transitions. If your application already contains rendering code or already uses the recommended DirectX API, you only need to do a minimal amount of work to use DirectComposition effectively.

How can I get a smooth text crawl using Flex?

I'm working on a standalone Flash application (written using Flex 3/ActionScript 3) that features a text crawl, like what you might see at the bottom of your TV when watching a cable news channel; it's a long narrow box that text moves across from right to left.
I've implemented it by creating a Label element, populating it with text, and then moving it using a mx:Move object with a Linear.easeNone easing function. It works, but it has ample room for improvement. It looks a bit jerky, and tends to have a fair amount of "tearing" (the top and bottom halves of the text sometimes fall out of sync).
I tried throwing math at the problem to get the crawl's movement rate synced with the monitor's refresh rate, but that was a bust. I found out the hard way that the app's frame rate jumps around too much; the "optimized" crawl varied between looking silky smooth and like it had epilepsy.
Is there anything else folks would recommend I try to smooth this thing out? Is there some alternate design you'd recommend I try?
Edit: Some context: the crawl is part of a digital signage application (played from a standalone Flash projector -- no web browser) that does stuff elsewhere on the screen, including video playback and rendering text and images. It definitely gets choppier during video playback, but it's never as smooth as I'd like it to be.
There are two potential solutions to this problem, but both have caveats, the first because of your use of Flex and a standalone projector, the second because it is a mitigator, not a complete solution.
Hardware Acceleration
When publishing your file, you can attempt to have Flash utilize hardware acceleration to alleviate the vertical refresh issue you are running into that is causing tearing. Sadly, Flex Builder 3 is incapable of enabling this setting at the SWF (projector) level (Link to bug). This has yet to be resolved and has been pushed from 4.0 to 4.1 to 4.x... If and when it is resolved, it will likely be a compiler argument in the project settings of Flash Builder 4.
You may be able to determine if this solution works for you by outputting your projector as a standard SWF and embedding it on an HTML document with the wmode set to "direct" or "gpu". Sadly, if it does (it should), you can't use it right now anyway. If you have Flash Builder 4, certain projects are capable of making round trips between FB4 and Flash Professional CS5, though I am not sure what the criteria for that is (my current AIR project has all the project modification menu options grayed out). If you do manage to get your project into Flash, you can enable hardware acceleration in the Publish Settings of the project (File->Publish Settings->Flash tab->Hardware Acceleration option in CS5).
This method is almost a certain solution for your problem, though it has two issues, one already highlighted above, and (for people publishing for the web) that by utilizing direct or GPU rendering on a webpage, you are unable to layer any DOM elements on top of flash.
direct: This mode tries to use the fastest path to screen, or direct path if you will. In most cases it will ignore whatever the browser would want to do to have things like overlapping HTML menus or such work. A typical use case for this mode is video playback. On Windows this mode is using DirectDraw or Direct3D on Vista, on OSX and Linux we are using OpenGL. Fidelity should not be affected when you use this mode.
gpu: This is fully fledged compositing (+some extras) using some functionality of the graphics card. Think of it being similar to what OSX and Vista do for their desktop managers, the content of windows (in flash language that means movie clips) is still rendered using software, but the result is composited using hardware. When possible we also scale video natively in the card. More and more parts of our software rasterizer might move to the GPU over the next few Flash Player versions, this is just a start. On Windows this mode uses Direct3D, on OSX and Linux we are using OpenGL.
(Source)
Direct is the ideal option for this situation, as you can actually have performance degradation with "gpu" as well as visual differences from graphics card to graphics card.
Lower your framerate
The Flash player will continue to play video at its native refresh rate independent of the rest of your project as long as you keep the framerate at or above approximately 2FPS (though I suggest 5FPS minimum). You won't want to run that low for this example, but you are able to lower the framerate of the entire scene without impacting video performance. The closer your framerate is to the screen refresh rate, the more apt you are to actually create the tearing effect unless you are able to absolutely sync with the monitor's refresh rate, which you probably cannot do without the above... Hardware Acceleration.
This problem has existed in the Flash Player for as long as it has been able to move objects horizontally. What happens is that Flash updates a buffered snapshot of the running animation at the same time that the screen is refreshing. If the buffered snapshot changes partway through a screen refresh, you get a tear. This is why lowering the framerate actually reduces the amount of tearing, you are refreshing the buffer less frequently.
As @Tegeril mentioned, using Flex is one of the reasons. Flex is a pretty heavy framework and it does a lot of things behind the scenes, as you'll know if you're familiar with the life cycle of a component (especially invalidating properties, invalidating the display list, etc.).
As a few minor things that might improve performance:
try to keep a simple display list. If you know the app will always be displayed at one size, then Flex won't waste time traversing the display list/tree up to the top and back for measurements. Also, try to use a Canvas. I know, it's not very clean, but since it uses absolute values and doesn't check with the 'parents' much, it should be faster than other containers (like HBox, VBox, etc.)
try to display the video at its full size (make sure the encoded video dimensions are right so that no CPU cycles are spent on resizing the video)
Ok, this was Flex stuff.
It might be very handy to read senocular's article on Asynchronous ActionScript Execution, which explains how Flash Player handles updates and renders.
[Figure: Frames both execute ActionScript and render the screen (source: senocular.com)]
[Figure: ActionScript taking a long time to complete delays rendering (source: senocular.com)]
I imagine the jerkiness is related to this. Also, I'm guessing you might get moments of smooth movement then sudden halts, every now and then, when Flash Player catches its breath (the Garbage Collector cleans up).
Victor Drâmbă's article on “Multithreading” in Actionscript might also be useful.
Soo, to recap:
use Profiler or something and see if the Flex framework is slowing you down, or where the 'bottleneck' is
improve as much as you can on that side then check if it's how Flash Player handles all the actionscript('elastic' frames)
If the bottleneck comes from the Flex framework, worst case, you can try to minimise the number of components that traverse the display list, and use pure actionscript for the other things (as @PatrickS suggested, use TweenLite, etc.)
If it helps, try to preload data(fetch rss feed and all that) at the start, and when you've got most of the important bits that don't require 'refreshes'/loads frequently, display the app. You will use more memory, but will have more cpu cycles to spare for other tasks.
Also, if it's display objects that are the 'bottleneck' and there's plenty of them, check if you can reuse them using Object Pools.
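For anyone who hasn't used one, an object pool boils down to the sketch below. It's written in Python only to keep it short; the class and method names are made up, and in the app it would be an ActionScript class handing out display objects instead of dicts.

```python
class ObjectPool:
    """Minimal object pool: reuse released objects instead of creating new ones."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds a fresh object
        self._free = []          # objects waiting to be reused
        self.created = 0         # how many objects were actually constructed

    def acquire(self):
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        # The caller is responsible for resetting the object's state first.
        self._free.append(obj)


pool = ObjectPool(dict)  # `dict` stands in for an expensive display object
a = pool.acquire()
pool.release(a)
b = pool.acquire()       # reuses the released object, no new allocation
```

The point is that fewer objects get created and thrown away, which should mean fewer garbage collection pauses.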
HTH
TweenMax or even TweenLite ( http://www.greensock.com )handles this sort of job pretty well. What else is your app doing while the text is scrolling though? Is it possible that some other processes are interfering?
This may not be helpful, but have you considered putting the crawling text into the HTML DOM and using CSS transitions to crawl the text? Obviously there's the IE problem, but it should be supported in IE9 and you could use JavaScript as a fallback.
This may seem silly, but CSS transitions are getting hardware acceleration and separate processes for plugins meaning on a multicore machine you could get parallel threads.
One thing you might consider is to move your label incrementally using a Timer instead of an easing function. That way you can take advantage of the updateAfterEvent method to get smoother rendering. Here's a link to an article/video from Chet Haase (Adobe's Flex graphics dude) that explains usage along with an example app with code:
http://graphics-geek.blogspot.com/2010/04/video-event-performance-in-flex.html
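One possible way to do the incremental move, sketched in Python rather than ActionScript: on each timer tick, derive the label's x from the elapsed time instead of adding a fixed step, so uneven ticks don't change the crawl speed. This time-based variant is my own assumption, not something taken from the linked article, and the function name, start position, speed and tick times are all made up.

```python
def crawl_x(start_x, speed_px_per_s, elapsed_s):
    """Label x position after `elapsed_s` seconds, moving right to left."""
    return start_x - speed_px_per_s * elapsed_s


# Uneven tick times in seconds, as a jittery timer might deliver them.
tick_times = [0.0, 0.033, 0.070, 0.100]
positions = [crawl_x(800, 120, t) for t in tick_times]
```

In the Flex version the timer handler would assign this value to the label's x and then call updateAfterEvent on the timer event.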
Hope that helps.

Flex printing on OSX pushes image off the page. How can this be fixed?

My Flex 3 app prints pages just fine from browsers on Windows using FlexPrintJob (not the browser print function). However, on OSX, the left and top margins show up larger and the page gets pushed off the right and bottom. Basically, the scaling is screwed up, and I can't see any way to adjust the margins in code.
Has anyone seen this discrepancy in Flex printing between Windows and OSX? Are there any known workarounds? I've searched all over and I can't find any good info on this (other than 12 unresolved printing bugs in the Adobe Jira DB).
And please don't say "don't print in Flex". I know Flex sucks at printing, but I have to use it. Thanks!
Edit:
PDF generation is one route, and while it's a valid solution for some folks, I need to print directly. I'd like to see stuff like using regular Flash PrintJob, monkeypatches to FlexPrintJob, or just ways I can format my DisplayObjects before sending them to FlexPrintJob. None of the scaling options in FlexPrintJob work. My Flex Component is at 1.0 scale. I'm not sure what else I can do except for mess around with regular PrintJob. I'm putting a bounty on this for answers in this domain.
Switch to PDF generation. There are two ways to do this without having to purchase server-side licenses:
Use our library of Flex components - clear.swc, a part of open source Clear Toolkit available on Sourceforge. This process is described in Ch. 11 of the book Enterprise Development with Flex currently available as rough cuts on safaribooksonline.com
Use open-source library alivePDF.
Don't print by Flex PrintJob :)

What happens when a Flex App can't run at the specified framerate?

In our application (a game), in some cases it can't run fast enough. Obviously we'd like to speed it up, but in the meantime when this happens it causes many problems (or if it's not causing them, the two are related).
The one which is least related to our own functionality is that the built in Alert.show() method stops working. Typically the full-screen transparent box appears but not the actual popup. I believe this is down to Flex giving all available cycles to other tasks... but it's proving difficult to investigate analytically so I am happy to hear another explanation.
To clarify, core parts of Flex are simply not working in this situation. I've stepped through the code for instance where a new element is added to the screen, everything happens and the addChild() method is called on the main display canvas... but then the element does not appear. If we then disable our update loop, the element suddenly appears.
So whether Flex is supposed to run the exact same code or not, somehow it IS blocking in some strange way. As I said, even the Flex Alert.show() method doesn't work.
All Flash content is executed frame-by-frame - Flash executes one frame's worth of code, then updates the screen, and then waits until the next frame update.
When Flash can't keep up with the specified framerate, all that happens is that instead of waiting between frame updates, Flash does them as fast as it can with no waiting in between. So the only visible difference is that frame updates occur less frequently. There are never cases where code is skipped, events are dropped, or screen redraws are skipped for performance reasons (unless you've found new bugs).
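A toy model of that behavior, in Python: each frame takes either the frame budget or the time its code needs, whichever is longer, and every frame still runs. This is a simplification for illustration (the function name and the numbers are made up), not a description of the player's actual scheduler.

```python
def frame_start_times(work_ms, fps):
    """Start time (ms) of each frame, given per-frame code cost in ms.

    Toy model: a frame that finishes early waits out the rest of its
    budget; a frame that overruns is followed immediately by the next one.
    """
    budget = 1000.0 / fps
    t = 0.0
    starts = []
    for work in work_ms:
        starts.append(t)
        t += max(work, budget)
    return starts


# Third frame needs 80 ms against a 40 ms budget at 25 fps.
starts = frame_start_times([10, 10, 80, 10], fps=25)
```

In this model the slow frame only pushes the following frames later; nothing is dropped.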
So the most likely culprit is that either you have a problem with code that's very time-dependent (such as code that expects two timers to trigger on the same frame), or some other problem that's being misdiagnosed. (For example, maybe there's a bug causing a slowdown, rather than a slowdown causing your bug.)
I'm not too sure if Flex has some additional performance handling of its own. But for pure actionscript the only thing that would happen is the framerate would slow to a crawl, everything will happen normally just slower. If you stack very large amounts of transparent or masked objects you might get some weird behavior, but that should be more noticeable.
And I guess telling you that making a game in Flex isn't that much of a good idea (just because of the performance overhead the framework has) is a bit late ;)
I like to make games in Flex 3 (ActionScript 3); it's actually a pretty handy solution compared to Flash CS3: a good debugging environment without hassle.
Of course it depends on the game style which one is better, if you need lot of graphics you may like Flash more, but Flex allows you to use external images, components, etc. Notice I am not talking about Flex XML project here.
Answer to your performance issue: you can use e.g. an old Mac OS X machine to see what happens on a very slow machine. A few solutions are:
- move objects by more than one pixel per update (more than x++ / y++) when the machine is old
- reduce the number of objects
You can detect how slow the machine is with a timer.
