I just want to query the number of streams in a file, but this seemingly simple task has turned out to be unexpectedly difficult.
It seems the query involves using IMediaObject. I have searched the IMediaObject documentation in DirectShow; it only lists the interface's methods, with no samples and no description of how to use them.
I have also searched the Windows 7 SDK. The only demonstration is in the dmoenum sample.
The initialization is encapsulated in ShowSelectedDMOInfo(const GUID *pCLSID).
What values can pCLSID take? Are there any samples that illustrate how to use IMediaObject?
I just want to query the number of streams in a file
IMediaObject is of no help here. It only reports the number of streams the object is designed to accept on its input and deliver on its output. A typical DMO has one input and one output stream, which has nothing to do with the streams in a file.
In DirectShow you can query the streams from the demultiplexing filter for the respective file format. These filters are rarely (if ever) packaged as DMOs.
It seems like I'm hitting a 1GB upper-boundary on my U-SQL input file size. Is there such a limit, and if so, how can this be increased?
Here's my case in a nutshell:
I'm working on a custom XML extractor that processes XML files of roughly 2.5 GB. These XML files conform to well-maintained XSD schemas. Using xsd.exe, I've generated .NET classes for XML serialization. The custom extractor uses these deserialized .NET objects to populate the output rows.
This all works nicely when running U-SQL on my local ADLA account from Visual Studio. Memory usage goes up to approximately 3 GB for a 2.5 GB input XML, so each file should fit comfortably on a single vertex.
It also works great with input files smaller than 1 GB on the Data Lake.
However, when trying to scale things up on the Data Lake Store, the job seems to get terminated when it hits the 1 GB input file size boundary.
I know that streaming the outer XML and then deserializing the inner XML fragments is an alternative, but we don't want to create, and particularly maintain, too much custom code that depends on those externally managed schemas.
Therefore, raising the upper-limit would be great.
I see two issues right now. One that we can address, and one for which we have a feature under development for later this year.
By default, U-SQL assumes that you want to scale out processing over your file and will split it into 1 GB "chunks" for extraction. If your extractor needs to see all the data (e.g., in order to parse XML, JSON, or an image), you need to mark the extractor to process files atomically (without splitting them) in the following way:
[SqlUserDefinedExtractor(AtomicFileProcessing = true)]
public class MyExtractor : IExtractor
{ ...
Now, while a vertex has 3GB of data, we currently limit the memory size for a UDO such as an extractor to 500MB. So if you process your XML in a way that requires a lot of memory, you will currently still fail with a System.OutOfMemory error. We are working on adding annotations to the UDOs that let you specify your memory requirements to override the default, but that is still under development at this point. The only ways to address that are to either make your data small enough or, in the case of XML for example, use a streaming parsing strategy that does not allocate too much memory (e.g., use the XML Reader interface).
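To make the streaming strategy concrete, here is a minimal sketch in Python rather than in a C# extractor; it only illustrates the idea of handling one record at a time and then releasing it. The element names (trades, trade, id, amount) and the helper iter_records are hypothetical and are not taken from the question's schemas.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield one dict per <record_tag> element without building the whole tree."""
    # iterparse emits (event, element) pairs while the parser advances.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield {child.tag: child.text for child in elem}
            # Drop the subtrees already processed so memory stays roughly flat.
            root.clear()


# Hypothetical sample input standing in for a multi-gigabyte file.
sample = io.BytesIO(
    b"<trades>"
    b"<trade><id>1</id><amount>10.5</amount></trade>"
    b"<trade><id>2</id><amount>7.0</amount></trade>"
    b"</trades>"
)
rows = list(iter_records(sample, "trade"))
print(rows)
```

In a real extractor each yielded record would be turned into an output row immediately instead of being collected in a list, so that only one record is held in memory at a time.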
I have multiple flat files (CSV), each containing multiple records, and the files arrive in random order. I have to combine their records using unique ID fields.
How can I combine them, if there is no common unique field for all files, and I don't know which one will be received first?
Here are some example files:
In reality there are 16 files.
There are many more fields and records than in this example.
I would avoid trying to do this purely in XSLT/BizTalk orchestrations/C# code. These are fairly simple flat files. Load them into SQL, and create a view to join your data up.
You can still use BizTalk to pick up and load the files. You can also still use BizTalk to execute the view or procedure that joins the data and sends your final message.
There are a few questions that might help guide how this would work here:
When do you want to join the data together? What triggers that (a time of day, a certain number of messages received, a certain type of message, a particular record, etc)? How will BizTalk know when it's received enough/the right data to join?
What does a canonical version of this data look like? Does all of the data from all of these files truly get correlated into one entity (e.g. a "Trade" or a "Transfer" etc.)?
I'd probably start with defining my canonical entity, and then look towards the path of getting a "complete" picture of that canonical entity by using SQL for this kind of case.
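As a rough sketch of the "load into SQL and join with a view" idea, the following uses Python with an in-memory SQLite database in place of SQL Server and BizTalk. The three tables, their columns, and the sample rows are hypothetical; they only mimic files that share keys pairwise rather than one common ID.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (order_id TEXT PRIMARY KEY, customer_id TEXT);
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE shipments (shipment_id TEXT PRIMARY KEY, order_id TEXT, carrier TEXT);
    -- The view only returns an entity once every related file has arrived.
    CREATE VIEW complete_orders AS
        SELECT o.order_id, c.name, s.carrier
        FROM orders o
        JOIN customers c ON c.customer_id = o.customer_id
        JOIN shipments s ON s.order_id = o.order_id;
""")


def load_csv(table, text):
    """Insert the rows of one CSV file (header line first) into its table."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(header)}) VALUES ({placeholders})", data
    )


# Files can arrive in any order; here the "parent" file arrives last.
load_csv("shipments", "shipment_id,order_id,carrier\nS1,O1,DHL\n")
load_csv("customers", "customer_id,name\nC1,Alice\n")
before = conn.execute("SELECT * FROM complete_orders").fetchall()
load_csv("orders", "order_id,customer_id\nO1,C1\n")
after = conn.execute("SELECT * FROM complete_orders").fetchall()
print(before, after)
```

Because the view uses inner joins, it doubles as the "is this entity complete yet?" check: nothing is returned for an order until all of its related records have been loaded.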
I was wondering if there's a way to build a QR code with two kinds of data - one text data and two link URLs. Is it possible to do it?
A QR Code is a two-dimensional barcode capable of storing (according to Wikipedia) up to 2,953 bytes of binary data or 4,296 simple alphanumeric characters. The data can contain whatever you like.
The difficulty with storing multiple URLs in a QR code is not that it is impossible, but that most scanner apps on smartphones and similar devices will only process a single URL. If you are writing the scanner app too, then yes, it is possible; otherwise it is possible but probably not advisable.
If you wish to store a single URL and some contact details you might look at storing a vCard in your QR code (here is a generator; I have no affiliation with this project).
It is indeed possible, but not all scanner apps will recognize all of the data, and most will show only one item. This QR code generator has a Multi URL feature that can redirect based on different parameters such as time, location, device, ...
It is possible. We can put text, a URL, and a vCard in a single QR code.
Well, actually, the QR code is "only" storing characters, so you could imagine an app or any other software that reads the QR code content (the text data plus the two URLs) and splits the string to open the two URLs in two tabs.
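A minimal sketch of that idea in Python, assuming you control both the generator and the reader: the payload format (one text line followed by one URL per line) and the function names are made up for illustration, and producing the actual QR image is left to whatever library you use.

```python
MAX_QR_BYTES = 2953  # binary capacity quoted earlier in this thread


def build_payload(text, urls):
    """Join the text and the URLs into one newline-separated string."""
    if "\n" in text:
        raise ValueError("this simple format does not allow newlines in the text")
    payload = "\n".join([text, *urls])
    if len(payload.encode("utf-8")) > MAX_QR_BYTES:
        raise ValueError("payload too large for a single QR code")
    return payload


def parse_payload(payload):
    """Split a scanned payload back into the text and the list of URLs."""
    text, *urls = payload.split("\n")
    return text, urls


payload = build_payload("Hello", ["https://example.com/a", "https://example.com/b"])
text, urls = parse_payload(payload)
print(text, urls)
```

A generic scanner app would just show this payload as plain text; only a reader that knows the format can open both URLs.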
In my mobile app (hybrid), I want to allow the user to take their data to another device. There will be no server-side components on my end. The data the user carries would contain images, audio, and video along with text, timestamps, etc. My design evolved as follows:
1. Store each entry in a JSON file with image, audio and video as Data URI and export this file to cloud sync platforms. The problem with this approach is that, even though JSON is better than XML, there could be better options. See below
2. Store each entry in a BSON file with image, audio and video as Data URI and export this file to cloud sync platforms. The problem with this approach is that, as mentioned on its site, the field names are still repeated, and protobuf could be a better fit.
3. Store each entry in a protocol buffer file with image, audio and video as Data URI and export this file to cloud sync platforms.
Then I stumbled across greenDAO, whose documentation mentions:
greenDAO lets you persist protocol buffer (protobuf) objects directly into the database.
What benefit would I get from storing the protobuf objects in a SQLite DB? Would I be able to export the SQLite file instead of a file containing objects in protobuf format?
Well, the data still has to be serialized somehow into the database. greenDAO just hides the serialization from you. Since you have specific needs, you are probably best building your own solution, better tailored for your needs.
If you don't anticipate the field names changing, why not just store the entries as database rows? This has a number of nice advantages, including the ability to have sortable and searchable entries.
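For illustration, storing entries as plain rows could look like the sketch below, written in Python with the standard sqlite3 module rather than in the app's own stack. The table layout, column names, and sample data are assumptions, and media is kept as a BLOB here instead of a Data URI.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # a real app would use a file it can export
db.execute("""
    CREATE TABLE entries (
        id         INTEGER PRIMARY KEY,
        created_at TEXT NOT NULL,   -- ISO 8601 strings sort chronologically
        note       TEXT,
        image      BLOB             -- raw bytes, no Base64 overhead
    )
""")
db.executemany(
    "INSERT INTO entries (created_at, note, image) VALUES (?, ?, ?)",
    [
        ("2024-01-02T09:00:00", "beach trip", b"\x89PNG-placeholder"),
        ("2024-03-15T18:30:00", "birthday dinner", None),
        ("2024-02-10T12:00:00", "beach cleanup", None),
    ],
)

# Sortable: newest entries first.
newest_first = [
    row[0] for row in db.execute("SELECT note FROM entries ORDER BY created_at DESC")
]
# Searchable: every entry that mentions "beach".
matches = [
    row[0]
    for row in db.execute(
        "SELECT note FROM entries WHERE note LIKE ? ORDER BY created_at", ("%beach%",)
    )
]
print(newest_first, matches)
```

With a file-backed database, the single SQLite file is the thing you would copy to the cloud sync platform.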
I need to retrieve non-video, non-audio application data which is embedded in an MP4 file. The data consists of measurements taken at the same time as the MP4 was recorded, which need to be rendered as charts in sync with the video & audio. The charts won't be rendered using DirectShow.
The data can be written into the MP4 file in one of three ways:
1. as multiple top-level mdat boxes
2. as multiple top-level boxes with proprietary FourCC
3. as a third track.
Which of the above methods of embedding the data would be most appropriate for DirectShow? What would the steps be to retrieve the data?
I have sample MP4 files in all of the three above formats and I can play the video and audio using Haali splitter. Does it come down to whether the MP4 source filter supports the reading of data? I would like to avoid having to write my own MP4 source filter if possible!
Many thanks
As you may know, there is no stock filter for MP4, so your best option is to check what exactly is supported by the filter you are going to use. For example, it is highly unlikely that these filters will make custom format data available.
The good news is that decent MP4 multiplexer/demultiplexer filters are available in source form (http://www.gdcl.co.uk/mpeg4/). If the measurements are timestamped, then an additional track looks best to me. You can always put extra data into the track description box. Having the source code available lets you add reasonable support for your custom format without much trouble.
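Since the charts are not rendered through DirectShow, data stored in top-level boxes (options 1 and 2 in the question) can also be located without any filter. The sketch below, in Python, walks the top-level boxes using the standard box header layout (32-bit big-endian size, four-character type, optional 64-bit size). The FourCC "meas" and the synthetic sample bytes are hypothetical; reading samples from a third track would additionally require parsing the moov box and is not shown.

```python
import io
import struct


def iter_top_level_boxes(stream):
    """Yield (fourcc, payload_offset, payload_size) for each top-level box."""
    while True:
        start = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            return  # end of file (or trailing garbage)
        size, fourcc = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            (size,) = struct.unpack(">Q", stream.read(8))
            header_len = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = stream.seek(0, io.SEEK_END) - start
        if size < header_len:
            raise ValueError(f"corrupt box header at offset {start}")
        yield fourcc.decode("latin-1"), start + header_len, size - header_len
        stream.seek(start + size)  # jump to the next sibling box


def make_box(fourcc, payload):
    """Build a box with a 32-bit size field (helper for the sample data only)."""
    return struct.pack(">I4s", 8 + len(payload), fourcc) + payload


# Synthetic stand-in for an MP4 file with one proprietary box.
data = (
    make_box(b"ftyp", b"isom\x00\x00\x00\x00")
    + make_box(b"meas", b"\x01\x02\x03")
    + make_box(b"mdat", b"abcd")
)
boxes = list(iter_top_level_boxes(io.BytesIO(data)))
print(boxes)
```

With the offsets in hand, the application can read the measurement payloads directly and synchronize them with playback using whatever timestamps the proprietary format carries.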