ActionScript PNGEncoder performance and UI blocking - apache-flex

I'm trying to use PNGEncoder to encode a BitmapData object into a PNG ByteArray so I can send the data to the server. Everything would be peachy, except the BitmapData is 4000x4000 px, and when I run PNGEncoder.encode on it the whole app stops (the UI is blocked) for 5-8 seconds while it runs. Does anybody have suggestions on how to keep it from blocking so badly? I read about chunking up the process (since you can't multithread in AS3), but I can't find any sample code that does it.
Thanks,
Sam

In addition to Arthur's comment, you could also write it in C/C++ for Alchemy, since Alchemy supports green threads. Like Pixel Bender, Alchemy also requires Flash 10.

There are mainly two ways to do this.
a) Use Pixel Bender:
You can offload the work to Pixel Bender (a shader-like language usable from AS3). This has the advantage of using the GPU in some cases, and it is also asynchronous and non-blocking (it runs on another thread). It does require Flash Player 10+, though. I haven't seen a Pixel Bender PNG encoder and, to be honest, it may not be possible (I am not familiar enough with PNG encoding to tell), but it might be an option. Performance-wise, this is the best you can get. More info here
b) Use chunking. Basically, you rewrite the encoder to encode blocks (lines, columns or a smaller area) and hook it to an enter-frame event: each frame you call next on your encoder, until there is nothing left to encode. Zeh has a neat chunked LZW encoder with source code that might give you insight into the details.
Cheers
Arthur

Another shameless plug!
You can use my recently completed PNGEncoder2 library (also requires Flash 10+), which handily supports gigantic images. It does proper asynchronous encoding, with no single compression step at the end. Additionally, it's really fast ;-)
Grab it from GitHub (README), and check out the benchmark comparing it with other encoders on my blog post.
It's highly tuned for speed, and uses the Alchemy opcodes and domain memory to speed it up (thanks to Haxe), so it should be comparable to anything you compile using Alchemy.

You could encode multiple PNG files separately and send them to the server. Once on the server you can reconstruct the larger image.

It's for JPEG encoding, but it should be useful - look at this post http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding/

As Arthur Debert said, you can use chunking. I'd suggest that instead of encoding one chunk per frame, you try a setTimeout( chunkingFunction, 0 ); approach. A timeout with a 0 ms delay fires as soon as possible, which lets the chunks process quickly without crushing the UI.
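To make the chunking pattern concrete, here is a minimal sketch. It is written in Java rather than ActionScript, purely for illustration. The class name ChunkedRowJob, the rowsPerStep parameter, and the CRC-32 that stands in for the real per-row PNG work are all assumptions and not part of any existing encoder. In Flash, step() would be called from an ENTER_FRAME handler or a setTimeout(..., 0) callback instead of the loop in main.

```java
import java.util.zip.CRC32;

/** Processes an image a few rows at a time so the caller can yield between steps. */
final class ChunkedRowJob {
    private final int[] pixels;      // ARGB pixels, row-major
    private final int width;
    private final int height;
    private final int rowsPerStep;   // how much work one step may do
    private final CRC32 crc = new CRC32(); // stand-in for the real per-row encoding
    private int nextRow = 0;

    ChunkedRowJob(int[] pixels, int width, int height, int rowsPerStep) {
        if (rowsPerStep < 1 || pixels.length < width * height) {
            throw new IllegalArgumentException("bad dimensions or chunk size");
        }
        this.pixels = pixels;
        this.width = width;
        this.height = height;
        this.rowsPerStep = rowsPerStep;
    }

    /** Handles up to rowsPerStep rows; returns true while more rows remain. */
    boolean step() {
        int end = Math.min(nextRow + rowsPerStep, height);
        for (; nextRow < end; nextRow++) {
            int base = nextRow * width;
            for (int x = 0; x < width; x++) {
                int p = pixels[base + x];
                crc.update(p >>> 24); // update(int) uses only the low 8 bits
                crc.update(p >>> 16);
                crc.update(p >>> 8);
                crc.update(p);
            }
        }
        return nextRow < height;
    }

    /** Final value; only meaningful once every row has been processed. */
    long result() {
        if (nextRow < height) throw new IllegalStateException("job not finished");
        return crc.getValue();
    }

    public static void main(String[] args) {
        int[] image = new int[400 * 400];
        ChunkedRowJob job = new ChunkedRowJob(image, 400, 400, 50);
        int ticks = 1;
        while (job.step()) {
            ticks++; // a real UI would return to its event loop here
        }
        System.out.println("finished in " + ticks + " ticks, crc=" + job.result());
    }
}
```

The chunk size is the tuning knob: larger chunks finish sooner overall but hold the UI thread longer on each tick.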

Related

QR Code Recognition in AGV (Auto Guided Vehicle)

I have some questions.
The first question is which equipment should be used to recognize the QR code.
I'm considering two options.
The first is a QR code scanner of the kind used in industrial settings.
The second is a camera module (OpenCV would be used).
However, the situation to consider is that the code should be recognized at a speed of 50 cm/s.
What do you think?
And if I use a camera, is there a library you can recommend for recognizing QR codes? (C/C++ only)
Always start with the simplest solution and move to something more complex only if needed. If you're using ROS/OpenCV, OpenCV has a built-in QR code detector (cv::QRCodeDetector). Other options include ZBar, quirc, and more, which you can find by searching GitHub or the internet.
As for a camera, if you don't need the intrinsic matrix, then you only need to decide on the resolution: higher resolution takes (non-linearly) longer to process, but lower resolution makes it harder to see the code well.
Your comment about "recognize at 50cm/s" doesn't make much sense. I assume you mean that you want to be able to decode a QR code that's up to 50 cm away, and do it in less than a second (to have time to stop). First you'll have to check whether the algorithm, running on your hardware, can detect the QR code at the desired distances, and how that changes when you scale the image up or down in OpenCV. Then you'll have to time how long it takes to detect and decode the code at those distances, resolutions and scales. If that isn't good enough, you can try another algorithm, try different compilation settings, perhaps give it its own thread, change the scaling of the image, accept the limitations, or change the hardware.
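The timing step can be sketched as follows. This is a generic harness in Java, not tied to any QR library: the decoder is passed in as a Runnable, and the stand-in workload in main is hypothetical. It also assumes the other reading of the question, namely that 50 cm/s is the vehicle's travel speed, and converts a measured latency into the distance covered while one decode is running.

```java
import java.util.Arrays;

final class DecodeBudget {
    private DecodeBudget() {}

    /** Median wall-clock time of task over the given number of runs, in nanoseconds. */
    static long medianNanos(Runnable task, int runs) {
        if (runs < 1) throw new IllegalArgumentException("runs must be >= 1");
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Distance the vehicle covers while one decode of the given duration runs. */
    static double distanceTravelledCm(double speedCmPerSec, long nanos) {
        return speedCmPerSec * nanos / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "detect and decode one frame"; replace with the real call.
        double[] sink = new double[1];
        Runnable fakeDecode = () -> {
            double acc = 0;
            for (int i = 1; i < 200_000; i++) acc += Math.sqrt(i);
            sink[0] = acc;
        };
        long median = medianNanos(fakeDecode, 21);
        System.out.printf("median %.3f ms, %.3f cm travelled at 50 cm/s%n",
                median / 1e6, distanceTravelledCm(50.0, median));
    }
}
```

Running the same harness once per image scale and distance gives the table of numbers the paragraph above asks for.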

Render YUV in JavaFX

I need to render a yuyv422 stream in JavaFX with minimum latency. If I convert it to RGB, I can use an ImageView with a WritableImage and a PixelFormat instance, and it works, but the RGB conversion consumes a lot of CPU, especially at high resolutions. I saw this exact feature request
https://bugs.openjdk.java.net/browse/JDK-8091933
but it seems it will not be implemented in Java 9. And if it is, I wonder whether it will introduce latency or demand too much CPU. Is there another way using JavaFX?
In General:
Image processing is always expensive, which is why vectorization or hardware acceleration is used for these tasks. Simply looping through an image with a single thread is already really slow, especially in Java. On top of that, people tend to use Color objects for color modifications, which is tremendously slow.
Pure Java:
If you want to keep your code in pure Java, you should check which internal format the WritableImage uses by calling:
myImage.getPixelWriter().getPixelFormat().getType()
If the internal format isn't RGB, adapt your color conversion to the given format to avoid a double conversion.
Additionally, make sure that your code is optimized as much as possible:
- Don't use any objects except arrays
- Minimize the use of local variables
You can also try to multithread the conversion process via parallel loops.
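As an illustration of the array-only approach, here is a minimal sketch of a YUYV 4:2:2 to packed ARGB conversion using integer math. It assumes BT.601 limited-range video (Y in 16..235), which may not match your stream; the class and method names are made up for this example.

```java
/** Converts packed YUYV 4:2:2 (bytes Y0 U Y1 V) to 0xAARRGGBB ints without allocating objects. */
final class YuyvToArgb {
    private YuyvToArgb() {}

    static void convert(byte[] yuyv, int[] argb, int width, int height) {
        int pixels = width * height;
        if ((width & 1) != 0 || yuyv.length < pixels * 2 || argb.length < pixels) {
            throw new IllegalArgumentException("width must be even and buffers large enough");
        }
        int in = 0;
        int out = 0;
        while (out < pixels) {
            // Assumed BT.601 limited-range coefficients, scaled by 256.
            int c0 = 298 * ((yuyv[in] & 0xFF) - 16);
            int d = (yuyv[in + 1] & 0xFF) - 128;   // U, shared by both pixels
            int c1 = 298 * ((yuyv[in + 2] & 0xFF) - 16);
            int e = (yuyv[in + 3] & 0xFF) - 128;   // V, shared by both pixels
            in += 4;
            int rAdd = 409 * e + 128;
            int gAdd = -100 * d - 208 * e + 128;
            int bAdd = 516 * d + 128;
            argb[out++] = pack(c0 + rAdd, c0 + gAdd, c0 + bAdd);
            argb[out++] = pack(c1 + rAdd, c1 + gAdd, c1 + bAdd);
        }
    }

    private static int pack(int r, int g, int b) {
        return 0xFF000000 | (clamp(r >> 8) << 16) | (clamp(g >> 8) << 8) | clamp(b >> 8);
    }

    private static int clamp(int v) {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }

    public static void main(String[] args) {
        byte[] yuyv = {(byte) 235, (byte) 128, 16, (byte) 128}; // one white and one black pixel
        int[] argb = new int[2];
        convert(yuyv, argb, 2, 1);
        System.out.printf("%08X %08X%n", argb[0], argb[1]);
    }
}
```

The resulting int[] can be handed to PixelWriter.setPixels together with PixelFormat.getIntArgbInstance(), so no further per-pixel conversion is needed on the JavaFX side. Splitting the rows across threads is straightforward because each pixel pair is independent.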
JNI:
Moving away from Java opens up a lot of possibilities. There are several platform independent libraries for converting YUV to RGB and back:
OpenCV:
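Easy to use, and it already comes with a Java API:
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;
System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // load the native OpenCV library once at startup
byte[] myYuvImage = null; // your YUYV image here (width * height * 2 bytes)
byte[] myRgbImage = new byte[width * height * 3]; // the output image
Mat yuvMat = new Mat(height, width, CvType.CV_8UC2); // YUV 4:2:2 is stored as 2 channels
Mat rgbMat = new Mat(height, width, CvType.CV_8UC3);
yuvMat.put(0, 0, myYuvImage);
// COLOR_YUV2RGB_Y422 is an alias for the UYVY byte order; a YUYV (YUY2) stream needs the YUY2 code
Imgproc.cvtColor(yuvMat, rgbMat, Imgproc.COLOR_YUV2RGB_YUY2);
rgbMat.get(0, 0, myRgbImage);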
Intel IPP:
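Only available via JNI. ippiRGBToYUV422_8u_C3C2R converts RGB to YUV422 (see RGBToYUV422 for more information); for the YUV-to-RGB direction asked about here, look at the matching YUV422ToRGB functions.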
SwScale as part of FFmpeg:
Only available via JNI. See this answer and adapt the example.
My personal experience is that IPP offers by far the best performance, even on AMD machines. However, although its license may be free of charge, it prohibits decompiling, which might not be compatible with LGPL libraries.

Store a video in a SQLite database?

I'm working on an algorithm which needs very fast random access to video frames in a possibly long video (minimum 30 minutes). I am currently using OpenCV's VideoCapture to read my video, but the seeking functionality is either broken or very slow. The best I found until now is using the MJPEG codec inside a MKV container, but it's not fast enough.
I can choose any video format or even create a new one. Storage space is not a problem (to some extent, of course). The only requirement is to get the fastest possible seek time to any location in the video. Ideally, I would like to be able to access multiple frames simultaneously, taking advantage of my quad-core CPU.
I know that relational databases are very good at storing large volumes of data, that they allow simultaneous read accesses, and that they're very fast when using indexes.
Is SQLite a good fit for my specific needs? I plan to store each video frame compressed as JPEG, and use an index on the frame number to access them quickly.
EDIT: for me a frame is just an image, not the entire video. A 30 min video at 25 fps contains 30*60*25 = 45000 frames, and I want to be able to quickly get one of them using its number.
EDIT: For those who might be interested, I finally implemented a custom video container that saves each frame in a fixed-size block (consequently, the position of any frame can be computed directly!). The images are compressed with the turbojpeg library and file accesses are multi-threaded (to be NCQ-friendly). The bottleneck is not the HDD anymore and I finally obtained much better performance :)
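The asker's actual container format is not shown, but the fixed-size block idea can be sketched roughly like this. The layout (a 4-byte length prefix followed by the JPEG bytes, padded up to the block size), the class name, and the method names are all assumptions made for illustration. Positional FileChannel reads are used so that several threads can fetch frames from the same file at once.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical container: frame i lives in the block that starts at i * blockSize. */
final class FixedBlockFrameFile implements AutoCloseable {
    private static final int HEADER_BYTES = Integer.BYTES; // payload length prefix
    private final FileChannel channel;
    private final int blockSize;

    FixedBlockFrameFile(Path path, int blockSize) throws IOException {
        if (blockSize <= HEADER_BYTES) throw new IllegalArgumentException("block too small");
        this.channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        this.blockSize = blockSize;
    }

    /** No index lookup: the position follows directly from the frame number. */
    long offsetOf(long frameIndex) {
        return frameIndex * blockSize;
    }

    void writeFrame(long frameIndex, byte[] jpeg) throws IOException {
        if (jpeg.length > blockSize - HEADER_BYTES) {
            throw new IllegalArgumentException("frame does not fit in one block");
        }
        ByteBuffer block = ByteBuffer.allocate(HEADER_BYTES + jpeg.length);
        block.putInt(jpeg.length).put(jpeg).flip();
        long pos = offsetOf(frameIndex);
        while (block.hasRemaining()) pos += channel.write(block, pos);
    }

    byte[] readFrame(long frameIndex) throws IOException {
        long pos = offsetOf(frameIndex);
        ByteBuffer header = ByteBuffer.allocate(HEADER_BYTES);
        readFully(header, pos);
        header.flip();
        int length = header.getInt();
        if (length < 0 || length > blockSize - HEADER_BYTES) {
            throw new IOException("corrupt block header at frame " + frameIndex);
        }
        ByteBuffer payload = ByteBuffer.allocate(length);
        readFully(payload, pos + HEADER_BYTES);
        return payload.array();
    }

    private void readFully(ByteBuffer dst, long pos) throws IOException {
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos); // positional read, does not move a shared cursor
            if (n < 0) throw new EOFException("frame lies beyond end of file");
            pos += n;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("frames", ".bin");
        try (FixedBlockFrameFile store = new FixedBlockFrameFile(tmp, 64 * 1024)) {
            store.writeFrame(2, new byte[] {10, 20, 30});
            System.out.println(store.readFrame(2).length + " bytes at offset " + store.offsetOf(2));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The trade-off is wasted space: the block size must be at least as large as the biggest compressed frame, so every smaller frame carries padding.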
I don't think using SQLite (or any other database engine) is a good solution for your problem. A database is not a filesystem.
If what you need is very fast random access, then stick to the filesystem: it was designed for this kind of usage and optimized with it in mind. As per your comment, a 5h video would require 450k files; well, that's not a problem in my opinion. Certainly, directory listing will take a while, but you will get the absolute fastest possible random access. And it will certainly be faster than SQLite, because you're one level of abstraction lower.
And if you're really worried about directory listing times, you just have to organize your folder structure like a tree. That will get you longer paths, but fast listing.
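One possible tree layout, as a small Java sketch: frames are grouped into directories of 1000 files each, and the path is derived from the frame number alone. The group size, the zero-padded naming scheme, and the FramePaths class are assumptions chosen for illustration.

```java
import java.util.Locale;

final class FramePaths {
    static final int FRAMES_PER_DIR = 1000; // assumed group size; tune to taste

    private FramePaths() {}

    /** Relative path such as "0045/0045123.jpg" for frame 45123. */
    static String relativePath(int frameNumber) {
        if (frameNumber < 0) throw new IllegalArgumentException("negative frame number");
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%04d/%07d.jpg",
                frameNumber / FRAMES_PER_DIR, frameNumber);
    }

    public static void main(String[] args) {
        System.out.println(relativePath(45123));
    }
}
```

With 450k frames this gives 450 directories of 1000 files each, so no single directory listing gets large.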
Keep a high-level perspective. The problem is that OpenCV isn't fast enough at seeking in the source video. This could be because:
- Codecs are not OpenCV's strength
- The source video is not encoded for efficient seeking
Your machine has a lot of dedicated graphics hardware to leverage, but it does not have specialized capabilities for randomly seeking within a 17 GB dataset, be it a file, a database, or a set of files. The disk will take a few milliseconds per seek. An SSD will do better, but still not great. Then you wait for the data to load into main memory. And you have to generate all that data in the first place.
Use ffmpeg, which should handle decoding very efficiently, perhaps even using the GPU. Here is a tutorial. (Disclaimer, I haven't used it myself.)
You might preprocess the video to add key frames. In principle this shouldn't require completely re-encoding, at least for MPEG, but I don't know much about specifics. MJPEG essentially turns all frames into keyframes, but you can find a middle ground and maybe seek 1.5x faster at a 2x size cost. But avoid hitting the disk.
As for SQLite, that is a fine solution to the problem of seeking within 17 GB of data. The notion that databases aren't optimized for random access is poppycock. Of course they are. A filesystem is a kind of database. Random access in 17 GB is slow because of hardware, not software.
I would recommend against using the filesystem for this task, because it's a shared resource synchronized with the rest of the machine. Also, creating half a million files (and deleting them when finished) will take a long time. That is not what a filesystem is specialized for. You can get around that, though, by storing several images to each file. But then you need some format to find the desired image, and then why not put them all in one file?
Indeed, (if going the 17 GB route) why not ignore the entire problem and put everything in virtual memory? VM is just as good at making the disk seek as SQLite or the filesystem. As long as the OS knows it's OK for the process to use that much memory, and you're using 64-bit pointers, it should be a fine solution, and the first thing to try.

Using DirectShow filters outside DirectShow?

I'm currently dealing with Windows Media Foundation. However, due to some problems with the Microsoft H.264 decoder and some missing decoders for custom formats, I'd like to know if it would be possible to instantiate a DirectShow decoder directly using its CLSID, and build a proxy around it that exposes IMFTransform to get a decoder for Media Foundation. So here is my question:
Can I instantiate DirectShow filters (preferably decoders) directly and use them for decoding (i.e. put in some compressed frames and get uncompressed ones out) to create an MFT?
I know how to instantiate the filter itself using its CLSID. However, I have no clue how to use the actual decoding functionality.
Any ideas, hints, links whatever will be appreciated. Thanks,
J.
(disclaimer: I have never actually done this, but I see no technical reason it cannot be done. So YMMV)
If the decoder is a DMO filter, then it'll be a lot easier--you can talk to it through IMediaObject. This isn't really much different from how DirectShow uses DMOs; it simply wraps the DMO with another transform filter that handles the media type negotiation and sample passing, but there's nothing really stopping you from doing this in your own application.
There's one catch: for IMediaObject::ProcessInput and IMediaObject::ProcessOutput, you'll need your own buffer class that implements IMediaBuffer. But it's a pretty basic interface, so I don't think you'll have too much trouble implementing it. Here's a basic implementation.
For regular DirectShow filters, it's actually going to be a lot more difficult, because most DirectShow filters really depend on there being an external graph available (case in point: all of the DirectShow eventing more or less presumes the existence of this graph). If you really want to use a single DirectShow filter standalone, you'd probably have to wrap the entire filter graph, and then have a custom source filter to feed samples in. You could use the sample grabber (or a custom render filter) to pull samples out of the graph and expose them to the rest of the application. (One sort of crazy idea would even be to wrap this graph in a DMO filter implementation, and then use IMediaObject to talk to it; this might be tricky, however.)
Luckily most decoders tend to be implemented as DMO filters, so I think there's a strong likelihood that you can just use IMediaObject.
I'm unsure as to why you would want to do this. You don't really want a filter living outside of a graph.
If you don't want to use the traditional file / network source filters, or the traditional renderers, you can write buffer renderers, and buffer source filters, that you pass pointers to, and get pointers from. Then you can drop the whole mess into a graph and run it, and get the use of the decoder pretty much directly without anything else. This wouldn't be difficult to do. The decoder is probably expecting a PES packet stream though.

Implement IP camera

We have a device that has an analog camera. We have a card that samples it and digitizes it. This is all done in DirectX. At this point in time, replacing hardware is not an option, but we need to write the code so that we can view this video feed in real time regardless of any hardware or operating system changes that occur in the future.
Along this line, we've chosen Qt to implement a GUI to view this camera feed. However, if we move to Linux or another embedded platform in the future and change other hardware (including the physical device where the camera/video sampler lives), we will need to change the camera display software as well, and that's going to be a pain because we need to integrate it into our GUI.
What I proposed was migrating to a more abstract model where data is sent over a socket to the GUI and the video is displayed live after being parsed from the socket stream.
First, is this a good idea or a bad idea?
Secondly, how would you implement such a thing? How do the video samplers usually give usable output? How can I push this output over a socket? Once I am on the receiving end parsing the output, how do I know what to do with the output (as in how to get the output to render)? The only thing I can think of would be to write each sample to a file and then to display the contents of the file every time a new sample arrives. This seems like an inefficient solution to me, if it would work at all.
How do you recommend I handle this? Are there any cross-platform libraries available for such a thing?
Thank you.
edit: I am willing to accept suggestions for something different from what is listed above.
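For the socket idea, one common approach is to frame each image with a small fixed header so the receiver knows how many bytes to read and how to interpret them. The sketch below is in Java for illustration only; the header layout (width, height, payload length as big-endian 32-bit integers), the size limit, and the FrameWire name are assumptions rather than an established protocol. The same byte layout can be parsed from any language on the GUI side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class FrameWire {
    static final int MAX_PAYLOAD = 64 * 1024 * 1024; // assumed sanity limit

    private FrameWire() {}

    static final class Frame {
        final int width;
        final int height;
        final byte[] pixels; // raw or compressed image bytes, as agreed by both ends

        Frame(int width, int height, byte[] pixels) {
            this.width = width;
            this.height = height;
            this.pixels = pixels;
        }
    }

    /** Writes a 12-byte header followed by the payload. */
    static void writeFrame(DataOutputStream out, Frame frame) throws IOException {
        out.writeInt(frame.width);
        out.writeInt(frame.height);
        out.writeInt(frame.pixels.length);
        out.write(frame.pixels);
        out.flush();
    }

    /** Blocks until one complete frame has arrived, then returns it. */
    static Frame readFrame(DataInputStream in) throws IOException {
        int width = in.readInt();
        int height = in.readInt();
        int length = in.readInt();
        if (width <= 0 || height <= 0 || length < 0 || length > MAX_PAYLOAD) {
            throw new IOException("corrupt frame header");
        }
        byte[] pixels = new byte[length];
        in.readFully(pixels); // a stream may deliver the payload in several pieces
        return new Frame(width, height, pixels);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the socket's output and input streams.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(new DataOutputStream(wire), new Frame(2, 1, new byte[] {1, 2, 3, 4, 5, 6}));
        Frame f = readFrame(new DataInputStream(new ByteArrayInputStream(wire.toByteArray())));
        System.out.println(f.width + "x" + f.height + ", " + f.pixels.length + " bytes");
    }
}
```

On the receiving end, each decoded frame can be copied into an in-memory image object and repainted, so nothing has to be written to a file between samples.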
Have you looked at QVision? It is a Qt based framework for managing video and video processing. You don't need the processing, but I think it will do what you want.
Anything that duplicates the video stream is going to cost you in performance, especially in an embedded space. In most situations for video, I think you're better off trying to use local hardware acceleration to blast the video directly to the screen. With some proper encapsulation, you should be able to use Qt for the GUI surrounding the video, and have a class that is platform specific that you use to control the actual video drawing to the screen (where to draw, and how big, etc.).
Edit:
You may also want to look at the Phonon library. I haven't looked at it much, but it appears to support showing video that may be acquired from a range of different sources.

Resources