How to render a byte array from socket/application using DirectShow? - directshow

I have an application. I will have a situation, wherein I will receive a big array of encoded bytes. I have to decode them and render it. For decoding, I am using a custom decoder class. After the decode, how can I construct a DirectShow graph which will receive input data from the decoder? Please give some direction/samples on this.

Have a look at the PushSource sample in the DirectShow SDK. This sample shows you how to create a source filter that can be rendered. It is all about setting the output media type of your filter correctly so that the rest of the graph can be rendered. The sample also shows you how to feed media samples to the rest of the media pipeline. In your case what do you decode to? The PushSource sample outputs RGB24 IIRC.
Also, it sounds like you're decoding in the same filter as your receiving the bytes in? Typically in DirectShow you would write a source filter that is able to receive bytes from the network and outputs samples in the encoded format. You would then connect this filter to a custom decoder filter, that then outputs either RGB24 or some raw media format that is understood by DirectShow. Similarly for audio, you could output say, PCM.
Edit:
I have used the same approach (CSource, CSourceStream). That is correct, the DoBufferProcessingLoop calls FillBuffer. My general approach has been to use the producer-consumer pattern. The networking-reading thread populates the queue with samples and in my overridden DoBufferProcessingLoop I check whether the queue has any data, calling FillBuffer if there is data. You can of course try other methods such as waiting on events (frame availibility). To see the approach I used you can download the source code of an example RTSP source filter at http://sourceforge.net/projects/videoprocessing/ and see if that suits you. Best thing I would say is to just try stuff and learn as you go along.

Related

Muxing non-synchronised streams to Haali

I have 2 input streams of data that are being passed to a Haali Muxer (mp4 format).
Currently I stream these to Haali directly in a DirectShow graph without a clock. I wondered if I should be trying to write these to the muxer synchronised, or whether it happily accepts a stream of audio data that stops before the video data stream stops. (I have issues with the output file not playing audio after seeking, and I'm not sure why this could occur)
I can't find much in the way of documentation for muxing with the Haali muxer, does anyone know the best place to look for info on this filter?
To have the streams multiplexed into single MP4 file you need single instance of multiplexer (Haali, GDCL, commercial, wrapper over mp4v2 library, over Media Foundation sink etc) with two (or more) input pins on it connected to respective sources, which in turn are going to be written as tracks.
Filter graph clock does not matter. Clock is for presentation, and file writers accept incoming data and write it as soon as possible anyway. It is more accurate to remove the clock, as you seem to already be doing, but having standard clock is not going to be different.
Data is synchronized using time stamps on individual media samples, parts of media streams. Multiplexer builds internal queues for every stream and then consumes data from the streams to build single file, in a sort of way that original stream data is interleaved. If one stream supplies too much data, that is, if data is available too early while another stream supplies data slowly, multiplexer blocks further data reception on this particular stream by not returning from respective processing call (IPin::Receive) expecting that during this wait the slow stream provides additional input. Eventually, what multiplexer looks at when matching data from different streams is data time stamps.
To obtain synchronized data in resulting MP4 file you, thus, need to make sure the payload data is properly time stamped. Multiplexer will take care of the rest.
This also includes that the time stamps should be monotonously increasing within a stream, and key frames/splice points are respectively indicated. Otherwise some multiplexers might issue a failure immediately, other would produce the output file but it might have playback issues (esp. seeking).

my YUY2 output does not work with Video Renderer filter

I have a basic avstream driver (based on the avshws sample).
When testing the YUY2 output I get different results based on which renderer I use:
Video Renderer: blank image
VMR-7: scrambled image (due to the renderer using a buffer with a larger stride)
VMR-9: perfect render
I dont know why the basic video renderer (used by amcap) wont work. I have examined the graph of a webcam outputting the same format and I cannot see any difference apart from the renderer output.
I'm assuming you're writing your own filter based on avshws. I'm not familiar with this particular sample but generally you need to ensure two things:
Make sure your filter checks any media types proposed are acceptable. In the DirectShow baseclasses the video renderer calls the output pin IPin::QueryAccept which calls whichever base class member you're using e.g. CBasePin.CheckMediaType or CTransformFilter.CheckTransform
Make sure you call IMediaSample::GetMediaType on every output sample and respond appropriately e.g. calling CTransformFilter.SetMediaType and changing the format/stride of your output. It's too late to negotiate at this point - you've accepted the change already and if you really can't continue you have to abort streaming, return an error HRESULT and notify EC_ERRORABORT or EC_ERRORABORTEX. Unless it's buggy the downstream filter should have called your output pin's QueryAccept and received S_OK before it sends a sample with a media type change attached (I've seen occasional filters that add duplicate media types to the first sample without asking).
See Handling Format Changes from the Video Renderer
I have figured out the problem. I was missing one line to update the remaining bytes in the stream pointer structure:
Leading->OffsetOut.Remaining = 0;
This caused certain filters to drop my samples (AVI/MJPEG Decompressor, Dump) which meant that certain graph configurations would simply not render anything.

Using multiple QR codes to encode a binary image

I'm increasingly looking at using QR codes to transmit binary information, such as images, since it seems whenever I demo my app, it's happening in situations where the WiFi or 3G/4G just doesn't work.
I'm wondering if it's possible to split a binary file up into multiple parts to be encoded by a series of QR codes?
Would this be as simple as splitting up a text file, or would some sort of complex data coherency check be required?
Yes, you could convert any arbitrary file into a series of QR codes,
something like Books2Barcodes.
The standard way of encoding data too big to fit in one QR code is with the "Structured Append Feature" of the QR code standard.
Alas, I hear that most QR encoders or decoders -- such as zxing -- currently do not (yet) support generating or reading such a series of barcodes that use the structured append feature.
QR codes already have a pretty strong internal error correction.
If you are lucky, perhaps splitting up your file with the "split" utility
into pieces small enough to fit into a easily-readable QR code,
then later scanning them in (hopefully) the right order and using "cat" to re-assemble them,
might be adequate for your application.
You surely can store a lot of data in a QR code; it can store 2953 bytes of data, which is nearly twice the size of a standard TCP/IP packet originated on an Ethernet network, so it's pretty powerful.
You will need to define some header for each QR code that describes its position in the stream required to rebuild the data. It'll be something like filename chunk 12 of 96, though encoded in something better than plain text. (Eight bytes for filename, one byte each for chunk number and total number of chunks -- a maximum of 256 QR codes, one simple ten-byte answer, still leaving 2943 bytes per code.)
You will probably also want to use some form of forward error correction such as erasure codes to encode sufficient redundant data to allow for mis-reads of either individual QR codes or entire missing QR codes to be transparently handled well. While you may be able to take an existing library, such as for Reed-Solomon codes to provide the ability to fix mis-reads within a QR code, handling missing QR codes entirely may take significantly more effort on your part.
Using erasure codes will of course reduce the amount of data you can transmit -- instead of all 753,408 bytes (256 * 2943), you will only have 512k or 384k or even less available to your final images -- depending upon what code rate you choose.
I think it is theoretically possible and as simple as splitting up text file. However, you probably need to design some kind of header to know that the data is multi-part and to make sure different parts can be merged together correctly regardless of the order of scanning.
I am assuming that the QR reader library returns raw binary data, and you will you the job of converting it to whatever form you want.
If you want automated creation and transmission, see
gre/qrloop: Encode a big binary blob to a loop of QR codes
maxg0/displaysocket.js: DisplaySocket.js - a JavaScript library for sending data from one device to another via QR ocdes using only a display and a camera
Note - I haven't used either.
See also: How can I publish data from a private network without adding a bidirectional link to another network - Security StackExchange

Handling Dynamic Format Changes in DirectShow

I just have simple graph:
SourceFilter ---> CustomTransformFilter --> VideoRendererFilter
In my CustomTranformFilter i change video properties dynamically:i.e i rescale video into new dimensions.
Input Video[1024,720]-->|CustomTransformFilter|--->Output Video[640,480]
But my renderer see the video as still in its original size ( [1024,720] not rescaled [640,480] )
And i get corrupted images at video renderer:Since renderer try to draw new image based on old dimensions...
How can i fix it?
Best Wishes
Update:
As i understand from Davies answer :
Given: The graph is active, but the filters in question do not support dynamic
pin reconnections
And
Possible mechanisms for changing the format: (MSDN DirectShow Doc)
a. QueryAccept (Downstream)
b. QueryAccept (Upstream)
c. ReceiveConnection
Davies suggest ReceiveConnection.
ReceiveConnection:is used when an output pin proposes a format change to
its downstream peer, and the new format requires a larger buffer. ( MSDN DirectShow Doc).
The gmfbridge example is "too complex" for me to figure out how to use "ReceiveConnection".
I am novice at DirectShow.
Any one has simple code example that use ReceiveConnection mechanism to respond dynamic format change?
The normal way to do a dynamic type change in DirectShow is to attach a Media Type to the sample that you deliver. This won't work with the video renderer, since it is allocating the samples. You need to request a change in type before you get the sample from the allocator.
You do this using ReceiveConnection. You must make sure that there are no buffers outstanding on that allocator, and then you can call IPin::ReceiveConnection (without disconnecting first). There is an example of this in the gmfbridge code at www.gdcl.co.uk/gmfbridge, in BridgeSourceOutput::SwitchTo().
G

change recording file programmatically in directshow

I made a console application, using directshow, that record from a live source (now a webcam, then a tv capture card), add current date and time in overlay and then save audio and video as .asf.
Now I want that the output file is going to change every 60 minutes without stopping the graph. I must not loose any seconds of the live stream.
The graph is something like this one:
http://imageshack.us/photo/my-images/543/graphp.jpg/
I took a look at the GMFBridge but I have some compiling problem with their examples.
I am wondering if there is a way to split what exist from the overlay filter and audio source, connect them to another asf writer (paused) and then switch them every 60 minutes.
The paused asf filter's file name must change (pp.asf, pp2.asf, pp4.asf ...). Something like this:
http://imageshack.us/photo/my-images/546/graph1f.jpg/
with pp1 paused. I found some people in internet that say that the asf writer deletes the current file if the graph does not go in stop mode.
Well, I have the product (http://www.videophill.com) that does exactly what you described (its used for broadcast compliance recording purposes) - and I found that only way to do that is this:
create a dshow graph that will be used only to capture the audio and video
then, at the end of the graph, insert samplegrabber filters, both for audio and video
then, use IWMWritter to create and save wmv file, using samples fetched from samplegrabber filters
when time comes, close one IWMWritter and create another one.
That way, you won't lose single frame when switching the output files.
Of course, there is also question of queue-ing and storing the samples (when switching the writters) and properly re-aligning the audio/video timestamps, but from my research, that's the only 'normal' way to do it, and I used in practice.
The solution is in writing a custom DShow filter with two input pins in your case. One for audio stream and the other for video stream. Inside that filter (doesn't have to be inside from the architecture point of view, because you can also use callbacks for example and do the job somewhere else) you should create asf files. While switching files, A/V data would be stored in cache (e.g. big enough circular buffer). You can also watch and modify A/V sync in that filter. For writing ASF files I would recommend Windows Media Format SDK.You can also add output pins if you like to pass A/V data further if necessary for preview, parallel streaming etc...
GMFBridge is a viable, but complicated solution, a more direct approach I have implemented in the past is querying your ASF Writer for the IWMWriterAdvanced2 interface and setting a custom sink. Within that interface you have methods to remove and add sinks to your ASF writer. The sink automatically connected will write to the file that you speficifed. One way to write whereever you want to is
1.) remove all default sinks:
pWriterAdv->RemoveSink(NULL);
2.) register a custom sink:
pWriterAdv->AddSink((IWMWriterSink*)&streamSink);
The custom sink can be a class that implements IWMWriterSink, which requires implementing callback methods that are called i.e. when the ASF header is written (OnHeader(/* [in] */ INSSBuffer *pHeader);) and when a data packet is written (OnDataUnit(/* [in] */ INSSBuffer *pDataUnit);) - in your implementation you can then write them wherever you want, for example offer additional methods on this class where you can specify the file name you want to write to.
Note that this solution does not quite get you were you want to if you need to write out the header information in each of the 60 minute files - after the initial header you will only get ASF packet data. A workaround for that could be to re-write the intial header before any packet data of each file, however this will produce an unindexed (non-seekable) ASF file.

Resources