Using DirectShow filters outside DirectShow? - directshow

I'm currently dealing with Windows Media Foundation. However, due to some problems with the Microsoft H.264 decoder and some missing decoders for custom formats, I'd like to know whether it is possible to instantiate a DirectShow decoder directly using its CLSID and build a proxy around it that exposes IMFTransform, to get a decoder for Media Foundation. So here is my question:
Can I instantiate a DirectShow filter (preferably a decoder) directly and use it for decoding (i.e. put in some compressed frames and get uncompressed ones out) to create an MFT?
I know how to instantiate the filter itself using its CLSID. However, I have no clue how to use the actual decoding functionality.
Any ideas, hints, links whatever will be appreciated. Thanks,
J.

(disclaimer: I have never actually done this, but I see no technical reason it cannot be done. So YMMV)
If the decoder is a DMO filter, then it'll be a lot easier--you can talk to it through IMediaObject. This isn't really much different from how DirectShow uses DMOs; it simply wraps the DMO with another transform filter that handles the media type negotiation and sample passing, but there's nothing really stopping you from doing this in your own application.
There's one catch: for IMediaObject::ProcessInput and IMediaObject::ProcessOutput, you'll need your own buffer class that implements IMediaBuffer. But it's a pretty basic interface, so I don't think you'll have too much trouble implementing it. Here's a basic implementation.
For regular DirectShow filters, it's actually going to be a lot more difficult, because most DirectShow filters really depend on there being an external graph available (case in point: all of the DirectShow eventing sort of presumes the existence of this graph). If you really want to use a single DirectShow filter standalone, you'd probably have to wrap the entire filter graph, and then have a custom source filter to feed samples in. You could use the sample grabber (or a custom render filter) to yank samples out of the graph and expose them to the rest of the application. (One sort of crazy idea would even be to wrap this graph in a DMO filter implementation, and then use IMediaObject to talk to it. This might be tricky, however.)
Luckily most decoders tend to be implemented as DMO filters, so I think there's a strong likelihood that you can just use IMediaObject.

I'm unsure as to why you would want to do this. You don't really want a filter living outside of a graph.
If you don't want to use the traditional file / network source filters, or the traditional renderers, you can write buffer renderers, and buffer source filters, that you pass pointers to, and get pointers from. Then you can drop the whole mess into a graph and run it, and get the use of the decoder pretty much directly without anything else. This wouldn't be difficult to do. The decoder is probably expecting a PES packet stream though.

Related

In a paragraph or less, what is the purpose and benefits of pointers?

See title. That's all I have to ask. The net doesn't have many succinct answers to this question. Please keep in mind stack vs heap. Explain as you would to a complete beginner. Just looking for the "why" not the "how".
edit
Are pointers a way to get large objects out of the stack?
When passing a huge object from one piece of your program to another to be worked on (an entire class instance, for example, or something with a large amount of data like an image or video), copying every single bit of data would be very inefficient. Instead you can just pass a tiny little memory address (a pointer) that the receiving part of your program can then use to get to the object to be worked on.
Aside from that huge aspect, they offer a lot of flexibility but I need more than a paragraph for that.
When you get into managed code like C# or Java EVERYTHING is done with pointers/references but it's all behind the scenes and you don't have to deal with them like you would in C++ or another similar language. But it's still crucial to understand how they work.
Edit in response to:
"why would I pass a large object around if I don't need to work on it?"
You wouldn't. However (correct me if I'm straying from what you're asking), what you'll learn if you continue into Computer Science is that a piece of your program should be as simple as possible: it should only do one thing. This is commonly known as the Single Responsibility Principle, and it means that you will have many seemingly tiny parts of your program that all work together to accomplish the overarching goal. That means that a lot of those tiny pieces are going to need to work on the same objects, the same data and use the same tools to get the job done. Let's look at a hypothetical.
You're coding a simple image editing application. You're going to need a cropping tool, a paint brush tool, a selection tool, and a resize tool. Each of these tools is going to need its own place in your program (a class or, more likely, many classes that work together) and that class will have many smaller pieces (methods/functions and other things) that work together to accomplish the goal of that class. Every single one of these classes and methods is most likely going to need to look at or modify the image data. With a pointer you can provide them with a memory address instead of making an entire copy of the image. That way, when one of the classes or methods makes a change to it, you don't need to worry about managing all these copies and making sure they all get the same change.
It allows you to do pass-by-reference/shared data structures, which has two big features: it saves memory and CPU overhead by not making copies, and it provides for complex communication patterns by making changes to shared data.
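The difference between sharing and copying can be sketched in a few lines. This is only an illustration: Python has references rather than raw pointers, and the `brighten` function and the tiny `pixels` list are made-up stand-ins for a real image tool and a real image buffer.

```python
import copy


def brighten(image, amount):
    """Modify the pixel buffer in place; every holder of the reference sees it."""
    for i in range(len(image)):
        image[i] = min(255, image[i] + amount)


pixels = [10, 200, 250]       # hypothetical stand-in for a large image buffer
shared = pixels               # a second reference to the same object, nothing is copied
snapshot = copy.copy(pixels)  # an independent copy, which costs time and memory

brighten(shared, 10)

print(pixels)    # changed, because `shared` and `pixels` refer to the same list
print(snapshot)  # unchanged, because it is a separate copy
```

Every tool that receives `pixels` works on the one buffer, so there are no copies to keep in sync.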

How to use non-blocking or asynchronous IO with Boost Spirit?

Does Spirit provide any capabilities for working with non-blocking IO?
To provide a more concrete example: I'd like to use Boost's Spirit parsing framework to parse data coming in from a network socket that's been placed in non-blocking mode. If the data is not completely available, I'd like to be able to use that thread to perform other work instead of blocking.
The trivial answer is to simply read all the data before invoking Spirit, but potentially gigabytes of data would need to be received and parsed from the socket.
It seems like that in order to support non-blocking I/O while parsing, Spirit would need some ability to partially parse the data and be able to pause and save its parse state when no more data is available. Additionally, it would need to be able to resume parsing from the saved parse state when data does become available. Or maybe I'm making this too complicated?
TODO: Will post an example for a simple single-threaded 'event-based' parsing model. This is largely trivial but might just be what you need.
For anything less trivial, please heed the following considerations/hints/tips:
How would you be consuming the result? You wouldn't have the synthesized attributes any earlier anyway, or are you intending to use semantic actions on the fly?
That doesn't usually work well due to backtracking. The caveats could be worked around by careful and judicious use of qi::hold, qi::locals and putting semantic actions with side-effects only at stations that will never be backtracked. In other words:
this is bound to be very error-prone
this naturally applies to a limited set of grammars only (those grammars with rich contextual information will not lend themselves well for this treatment).
Now, everything can be forced, of course, but in general, experienced programmers should have learned to avoid swimming upstream.
Now, if you still want to do this:
You should be able to make the Spirit library thread safe / reentrant by defining BOOST_SPIRIT_THREADSAFE and linking to libboost_thread. Note this makes the globals used by Spirit threadsafe (at the cost of fine grained locking) but not your parsers: you can't share your own parsers/rules/sub grammars/expressions across threads. In fact, you can only share your own (Phoenix/Fusion) functors iff they are threadsafe, and any other extensions defined outside the core Spirit library should be audited for thread-safety.
If you manage the above, I think by far the best approach would seem to
use boost::spirit::istream_iterator (or, for binary/raw character streams I'd prefer to define a similar boost::spirit::istreambuf_iterator using the boost::spirit::multi_pass<> template class) to consume the input. Note that depending on your grammar, quite a bit of memory could be used for buffering and the performance is suboptimal
run the parser on its own thread (or logical thread, e.g. Boost Asio 'strands' or its famous 'stackless coroutines')
use coarse-grained semantic actions like shown above to pass messages to another logical thread that does the actual processing.
Some more loose pointers:
you can easily 'fuse' some functions to handle lazy evaluation of your semantic action handlers using BOOST_FUSION_ADAPT_FUNCTION and friends. This reduces the amount of cruft you have to write to get simple things working, like normal C++ overload resolution in semantic actions, especially when you're not using C++0X and BOOST_RESULT_OF_USE_DECLTYPE
Because you will want to avoid semantic actions with side-effects, you should probably look at Inherited Attributes and qi::locals<> to coordinate state across rules in 'pure functional fashion'.

Implement IP camera

We have a device that has an analog camera. We have a card that samples it and digitizes it. This is all done in DirectX. At this point in time, replacing hardware is not an option, but we need to code such that we can see this video feed in real time regardless of any hardware or underlying operating system changes that occur in the future.
Along this line, we've chosen Qt to implement a GUI to view this camera feed. However, if we move to Linux or another embedded platform in the future and change other hardware (including the physical device where the camera/video sampler lives), we will need to change the camera display software as well, and that's going to be a pain because we need to integrate it into our GUI.
What I proposed was migrating to a more abstract model where data is sent over a socket to the GUI and the video is displayed live after being parsed from the socket stream.
First, is this a good idea or a bad idea?
Secondly, how would you implement such a thing? How do the video samplers usually give usable output? How can I push this output over a socket? Once I am on the receiving end parsing the output, how do I know what to do with the output (as in how to get the output to render)? The only thing I can think of would be to write each sample to a file and then to display the contents of the file every time a new sample arrives. This seems like an inefficient solution to me, if it would work at all.
How do you recommend I handle this? Are there any cross-platform libraries available for such a thing?
Thank you.
Edit: I am willing to accept suggestions of something different from what is listed above.
Have you looked at QVision? It is a Qt based framework for managing video and video processing. You don't need the processing, but I think it will do what you want.
Anything that duplicates the video stream is going to cost you in performance, especially in an embedded space. In most situations for video, I think you're better off trying to use local hardware acceleration to blast the video directly to the screen. With some proper encapsulation, you should be able to use Qt for the GUI surrounding the video, and have a class that is platform specific that you use to control the actual video drawing to the screen (where to draw, and how big, etc.).
Edit:
You may also want to look at the Phonon library. I haven't looked at it much, but it appears to support showing video that may be acquired from a range of different sources.

Actionscript PNGEncoder performance and UI blocking

I'm trying to use PNGEncoder to encode a bitmapData object into a PNG ByteArray so I can send the data to the server. Everything would be peachy except the bitmapData is 4000x4000px and when I run the PNGEncoder.encode function on it the whole app stops (UI is blocked) for 5-8 seconds while it runs. Does anybody have any suggestions on how to keep it from blocking so badly? I read about chunking up the process (since you can't multithread in AS3) but can't find any sample code for it.
Thanks,
Sam
In addition to Arthur's comment, you could also write it in C/C++ for Alchemy, since alchemy supports green threads. Like PixelBender, Alchemy also requires Flash 10.
There are mainly two ways to do this.
a) Use pixel bender:
You can offload the work to Pixel Bender (a shader-like language in AS3). This has the advantage of using the GPU in some cases, but it is also asynchronous and non-blocking (it runs on another thread). It does require player 10+. I haven't seen a Pixel Bender PNG encoder, and to be honest, it may not be possible (I am not familiar enough with PNG encoding to tell), but it might be an option. This is, performance-wise, the best you can get. More info here
b) Use chunking. Basically, you rewrite the encoder to encode blocks (lines, columns or a smaller area), and hook that to an enter frame event. Each frame you'd call next on your encoder, until there is no more encoding to do. Zeh has a neat LZW chunked encoder with source code that might give you insights into the details.
Cheers
Arthur
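The chunking idea from option b) can be sketched as follows. This is written in Python rather than ActionScript and only shows the control flow: it deflates raw rows with `zlib` instead of producing a real PNG, and `ChunkedEncoder`, `rows_per_step` and the sample rows are all hypothetical names and values chosen for the illustration.

```python
import zlib


class ChunkedEncoder:
    """Compress a list of row buffers a few rows at a time."""

    def __init__(self, rows, rows_per_step=64):
        self._rows = rows
        self._step = rows_per_step
        self._pos = 0
        self._compressor = zlib.compressobj()
        self._parts = []
        self.done = False

    def next(self):
        """Do one slice of work; meant to be called once per frame or timer tick."""
        if self.done:
            return
        end = min(self._pos + self._step, len(self._rows))
        for row in self._rows[self._pos:end]:
            self._parts.append(self._compressor.compress(row))
        self._pos = end
        if self._pos >= len(self._rows):
            # Last slice: flush whatever the compressor is still holding.
            self._parts.append(self._compressor.flush())
            self.done = True

    def result(self):
        """Return the compressed bytes; only complete once `done` is True."""
        return b"".join(self._parts)


rows = [bytes([y % 256]) * 400 for y in range(1000)]  # fake 400-byte image rows
encoder = ChunkedEncoder(rows, rows_per_step=100)
steps = 0
while not encoder.done:
    encoder.next()  # in a UI, the frame or timer handler would make this call
    steps += 1
```

In a real application the loop at the bottom would be replaced by the enter frame handler, so the UI gets control back between slices.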
Another shameless plug!
You can use my recently completed PNGEncoder2 library (also requires Flash 10+), which handily supports gigantic images. It does proper asynchronous encoding, with no single compression step at the end. Additionally, it's really fast ;-)
Grab it from GitHub (README), and check out the benchmark comparing it with other encoders on my blog post.
It's highly tuned for speed, and uses the Alchemy opcodes and domain memory to speed it up (thanks to Haxe), so it should be comparable to anything you compile using Alchemy.
You could encode multiple PNG files separately and send them to the server. Once on the server you can reconstruct the larger image.
It's for JPEG encoding, but should be useful - look at this post http://segfaultlabs.com/blog/post/asynchronous-jpeg-encoding/
As Arthur Debert said, you can use chunking. I'd suggest that instead of encoding once/frame, you try a setTimeout( chunkingFunction, 0 ); approach. A timeout with a 0 ms delay will happen as soon as possible, allowing the chunking to process quickly but without crushing the UI.

Use-cases for reflection

Recently I was talking to a co-worker about C++ and lamented that there was no way to take a string with the name of a class field and extract the field with that name; in other words, it lacks reflection. He gave me a baffled look and asked when anyone would ever need to do such a thing.
Off the top of my head I didn't have a good answer for him, other than "hey, I need to do it right now". So I sat down and came up with a list of some of the things I've actually done with reflection in various languages. Unfortunately, most of my examples come from my web programming in Python, and I was hoping that the people here would have more examples. Here's the list I came up with:
Given a config file with lines like
x = "Hello World!"
y = 5.0
dynamically set the fields of some config object equal to the values in that file. (This was what I wished I could do in C++, but actually couldn't do.)
When sorting a list of objects, sort based on an arbitrary attribute given that attribute's name from a config file or web request.
When writing software that uses a network protocol, reflection lets you call methods based on string values from that protocol. For example, I wrote an IRC bot that would translate
!some_command arg1 arg2
into a method call actions.some_command(arg1, arg2) and print whatever that function returned back to the IRC channel.
When using Python's __getattr__ function (which is sort of like method_missing in Ruby/Smalltalk) I was working with a class with a whole lot of statistics, such as late_total. For every statistic, I wanted to be able to add _percent to get that statistic as a percentage of the total things I was counting (for example, stats.late_total_percent). Reflection made this very easy.
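Two of the items above (the config file and the `_percent` statistics) might look roughly like this in Python. It is a sketch under assumptions, not the original code: the `Config` and `Stats` classes and the `load_config` helper are invented for the illustration, and the config format is assumed to be one `name = literal` pair per line.

```python
import ast


class Config:
    """Plain object whose fields are filled in by name."""


def load_config(lines, target):
    """Set target.<name> for every 'name = value' line."""
    for line in lines:
        name, _, raw = line.partition("=")
        # ast.literal_eval parses literals such as "Hello World!" or 5.0
        # without executing arbitrary code.
        setattr(target, name.strip(), ast.literal_eval(raw.strip()))
    return target


class Stats:
    def __init__(self, total, **counts):
        self.total = total
        for name, value in counts.items():
            setattr(self, name, value)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        suffix = "_percent"
        if name.endswith(suffix):
            base = getattr(self, name[:-len(suffix)])
            return 100.0 * base / self.total
        raise AttributeError(name)


cfg = load_config(['x = "Hello World!"', "y = 5.0"], Config())
stats = Stats(total=200, late_total=50)
print(cfg.x, cfg.y)
print(stats.late_total_percent)
```

Neither class knows the field names in advance; they come from the input and from the attribute name being requested.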
So can anyone here give any examples from their own programming experiences of times when reflection has been helpful? The next time a co-worker asks me why I'd "ever want to do something like that" I'd like to be more prepared.
I can list the following uses for reflection:
Late binding
Security (introspect code for security reasons)
Code analysis
Dynamic typing (duck typing is not possible without reflection)
Metaprogramming
Some real-world usages of reflection from my personal experience:
Developed plugin system based on reflection
Used aspect-oriented programming model
Performed static code analysis
Used various Dependency Injection frameworks
...
Reflection is a good thing :)
I've used reflection to get current method information for exceptions, logging, etc.
string src = MethodInfo.GetCurrentMethod().ToString();
string msg = "Big Mistake";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
instead of
string src = "MyMethod";
string msg = "Big Mistake";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
It's just easier for copy/paste inheritance and code generation.
I'm in a situation now where I have a stream of XML coming in over the wire and I need to instantiate an Entity object that will populate itself from elements in the stream. It's easier to use reflection to figure out which Entity object can handle which XML element than to write a gigantic, maintenance-nightmare conditional statement. There's clearly a dependency between the XML schema and how I structure and name my objects, but I control both so it's not a big problem.
There are lots of times you want to dynamically instantiate and work with objects where the type isn't known until runtime, for example with OR-mappers or in a plugin architecture. Mocking frameworks use it too, and it is useful if you want to write a logging library that dynamically examines the type and properties of exceptions.
If I think a bit longer I can probably come up with more examples.
I find reflection very useful if the input data (like XML) has a complex structure which is easily mapped to object instances, or if I need some kind of "is a" relationship between the instances.
As reflection is relatively easy in Java, I sometimes use it for simple data (key-value maps) where I have a small fixed set of keys. On the one hand it's simple to determine if a key is valid (if the class has a setter setKey(String data)); on the other hand I can change the type of the (textual) input data and hide the transformation (e.g. a simple cast to int in getKey()), so the rest of the application can rely on correctly typed data.
If the type of some key-value pair changes for one object (e.g. from int to float), I only have to change it in the data object and its users but don't have to keep in mind to check the parser too. This might not be a sensible approach if performance is an issue...
Writing dispatchers. Twisted uses Python's reflective capabilities to dispatch XML-RPC and SOAP calls. RMI uses Java's reflection API for dispatch.
Command line parsing. Building up a config object based on the command line parameters that are passed in.
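A string-based dispatcher of the kind mentioned above can be sketched in Python with `getattr`. This is a generic illustration, not how Twisted or RMI implement it; the `Actions` class, its methods and the `!command` message format are assumptions made for the example.

```python
class Actions:
    """Hypothetical command handlers; every public method is a command."""

    def greet(self, name):
        return f"hello, {name}"

    def add(self, a, b):
        return str(int(a) + int(b))


def dispatch(actions, message):
    """Turn '!command arg1 arg2' into actions.command('arg1', 'arg2')."""
    if not message.startswith("!"):
        return None
    parts = message[1:].split()
    if not parts:
        return None
    name, args = parts[0], parts[1:]
    if name.startswith("_"):
        return None  # never expose private or dunder attributes
    handler = getattr(actions, name, None)
    if not callable(handler):
        return f"unknown command: {name}"
    return handler(*args)


print(dispatch(Actions(), "!greet world"))
print(dispatch(Actions(), "!add 2 3"))
```

Adding a new command is just adding a method; the dispatcher itself never changes. The underscore check matters because the names come from untrusted input.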
When writing unit tests, it can be helpful to use reflection, though mostly I've used this to bypass access modifiers (Java).
I've used reflection in C# when there was some internal or private method in the framework or a third party library that I wanted to access.
(Disclaimer: It's not necessarily a best-practice because private and internal methods may be changed in later versions. But it worked for what I needed.)
Well, in statically-typed languages, you'd want to use reflection any time you need to do something "dynamic". It comes in handy for tooling purposes (scanning the members of an object). In Java it's used in JMX and dynamic proxies quite a bit. And there are tons of one-off cases where it's really the only way to go (pretty much anytime you need to do something the compiler won't let you do).
I generally use reflection for debugging. Reflection can more easily and more accurately display the objects within the system than an assortment of print statements. In many languages that have first-class functions, you can even invoke the functions of the object without writing special code.
There is, however, a way to do what you want(ed). Use a hashtable. Store the fields keyed against the field name.
If you really wanted to, you could then create standard Get/Set functions, or create macros that do it on the fly. #define GetX() Get("X") sort of thing.
You could even implement your own imperfect reflection that way.
For the advanced user, if you can compile the code, it may be possible to enable debug output generation and use that to perform reflection.

Resources