How to get two media streams from one file (for a custom mixer) - ms-media-foundation

There is one file with two video streams.
I want to mix these two streams into one output stream (using the Media Session).
I think the topology can be written as shown below, based on this MSDN sample:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms701605(v=vs.85).aspx
DWORD cSourceStreams = 0;
BOOL fSelected = FALSE;
IMFActivate *pSinkActivate = NULL;
IMFTopologyNode *pOutputNode = NULL;

pPD->GetStreamDescriptorCount(&cSourceStreams);

// For each stream, create the topology nodes and add them to the topology.
for (DWORD i = 0; i < cSourceStreams; i++)
{
    IMFStreamDescriptor *pSD = NULL;
    IMFTopologyNode *pSourceNode = NULL;

    pPD->GetStreamDescriptorByIndex(i, &fSelected, &pSD);

    // Create the media sink activation object (only once, for the first stream).
    if (i == 0)
        CreateMediaSinkActivate(pSD, hVideoWnd, &pSinkActivate);

    // Add a source node for this stream.
    AddSourceNode(pTopology, pSource, pPD, pSD, &pSourceNode);

    // Create the output node for the renderer (only once, for the first stream).
    if (i == 0)
        AddOutputNode(pTopology, pSinkActivate, 0, &pOutputNode);

    // Connect the source node to the output node.
    // (Note: both streams currently connect to input 0 of the same output node.)
    pSourceNode->ConnectOutput(0, pOutputNode, 0);

    SafeRelease(&pSD);
    SafeRelease(&pSourceNode);
}
SafeRelease(&pSinkActivate);
SafeRelease(&pOutputNode);
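What I cannot figure out is the mixing itself. My rough, untested idea is to insert a transform node for the custom mixer MFT between the two source nodes and the single output node, something like the sketch below (pMixerMFT, pSourceNode0 and pSourceNode1 are just placeholders; I would have to keep references to the two source nodes instead of releasing them inside the loop):
// Rough sketch only: route both source streams into a custom mixer MFT
// via a transform node, and the mixer's single output into the renderer node.
IMFTopologyNode *pMixerNode = NULL;
MFCreateTopologyNode(MF_TOPOLOGY_TRANSFORM_NODE, &pMixerNode);
pMixerNode->SetObject(pMixerMFT);               // pMixerMFT = the custom mixer MFT (placeholder)
pTopology->AddNode(pMixerNode);

// Each source node feeds one input of the mixer ...
pSourceNode0->ConnectOutput(0, pMixerNode, 0);  // first video stream
pSourceNode1->ConnectOutput(0, pMixerNode, 1);  // second video stream
// ... and the mixer feeds the single output (renderer) node.
pMixerNode->ConnectOutput(0, pOutputNode, 0);

SafeRelease(&pMixerNode);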
I don't know whether I'm doing this correctly.
Please help me.
Thanks.

Related

NewTek NDI (SDK v5) with Qt6.3: How to display NDI video frames on the GUI?

I have integrated the NDI SDK from NewTek (current version 5) into my Qt6.3 widget project.
I copied and included the required DLLs and header files from the NDI SDK installation directory into my project.
To test my build environment I tried to compile a simple test program based on the example from "..\NDI 5 SDK\Examples\C++\NDIlib_Recv".
That was also successful.
I was therefore able to receive and access data from my NDI source.
So there is a valid frame in video_frame, of type NDIlib_video_frame_v2_t. Within the structure I can also query correct frame data such as the size (.xres and .yres).
The pointer p_data points to the actual data.
So far so good.
Of course, I now want to display this frame on the Qt6 GUI. In other words, the only thing missing now is the conversion into an appropriate format so that I can display the frame with QImage, QPixmap, QLabel, etc.
But how?
So far I've tried calls like this:
curFrame = QImage(video_frame.p_data, video_frame.xres, video_frame.yres, QImage::Format::Format_RGB888);
curFrame.save("out.jpg");
I'm not sure if the format is correct either.
Here's a closer look at the mentioned frame structure within the Qt debug session (screenshot: my NDI video frame after receiving).
Within "video_frame" you can see the specification video_type_UYVY.
This may really be the format as it appears at the source!?
Fine, but how do I get this converted now?
Many thanks and best regards
You mean something like this? :)
https://github.com/NightVsKnight/QtNdiMonitorCapture
Specifically:
https://github.com/NightVsKnight/QtNdiMonitorCapture/blob/main/lib/ndireceiverworker.cpp
Assuming you connect using NDIlib_recv_color_format_best:
NDIlib_recv_create_v3_t recv_desc;
recv_desc.p_ndi_recv_name = "QtNdiMonitorCapture";
recv_desc.source_to_connect_to = ...;
recv_desc.color_format = NDIlib_recv_color_format_best;
recv_desc.bandwidth = NDIlib_recv_bandwidth_highest;
recv_desc.allow_video_fields = true;
pNdiRecv = NDIlib_recv_create_v3(&recv_desc);
Then, when you receive an NDIlib_video_frame_v2_t:
void NdiReceiverWorker::processVideo(
        NDIlib_video_frame_v2_t *pNdiVideoFrame,
        QList<QVideoSink*> *videoSinks)
{
    auto ndiWidth = pNdiVideoFrame->xres;
    auto ndiHeight = pNdiVideoFrame->yres;
    auto ndiLineStrideInBytes = pNdiVideoFrame->line_stride_in_bytes;
    auto ndiPixelFormat = pNdiVideoFrame->FourCC;
    auto pixelFormat = NdiWrapper::ndiPixelFormatToPixelFormat(ndiPixelFormat);
    if (pixelFormat == QVideoFrameFormat::PixelFormat::Format_Invalid)
    {
        qDebug().nospace() << "Unsupported pNdiVideoFrame->FourCC " << NdiWrapper::ndiFourCCToString(ndiPixelFormat) << "; return;";
        return;
    }

    QSize videoFrameSize(ndiWidth, ndiHeight);
    QVideoFrameFormat videoFrameFormat(videoFrameSize, pixelFormat);
    QVideoFrame videoFrame(videoFrameFormat);
    if (!videoFrame.map(QVideoFrame::WriteOnly))
    {
        qWarning() << "videoFrame.map(QVideoFrame::WriteOnly) failed; return;";
        return;
    }

    auto pDstY = videoFrame.bits(0);
    auto pSrcY = pNdiVideoFrame->p_data;
    auto pDstUV = videoFrame.bits(1);
    auto pSrcUV = pSrcY + (ndiLineStrideInBytes * ndiHeight);
    for (int line = 0; line < ndiHeight; ++line)
    {
        memcpy(pDstY, pSrcY, ndiLineStrideInBytes);
        pDstY += ndiLineStrideInBytes;
        pSrcY += ndiLineStrideInBytes;
        if (pDstUV)
        {
            // For now QVideoFrameFormat/QVideoFrame does not support P216. :(
            // I have started the conversation to have it added, but that may take awhile. :(
            // Until then, copying only every other UV line is a cheap way to downsample
            // P216's 4:2:2 to P016's 4:2:0 chroma sampling.
            // There are still a few visible artifacts on the screen, but it is passable.
            if (line % 2)
            {
                memcpy(pDstUV, pSrcUV, ndiLineStrideInBytes);
                pDstUV += ndiLineStrideInBytes;
            }
            pSrcUV += ndiLineStrideInBytes;
        }
    }
    videoFrame.unmap();

    foreach (QVideoSink *videoSink, *videoSinks)
    {
        videoSink->setVideoFrame(videoFrame);
    }
}
QVideoFrameFormat::PixelFormat NdiWrapper::ndiPixelFormatToPixelFormat(enum NDIlib_FourCC_video_type_e ndiFourCC)
{
    switch (ndiFourCC)
    {
    case NDIlib_FourCC_video_type_UYVY:
        return QVideoFrameFormat::PixelFormat::Format_UYVY;
    case NDIlib_FourCC_video_type_UYVA:
        return QVideoFrameFormat::PixelFormat::Format_UYVY;
    // Result when requesting NDIlib_recv_color_format_best
    case NDIlib_FourCC_video_type_P216:
        return QVideoFrameFormat::PixelFormat::Format_P016;
    //case NDIlib_FourCC_video_type_PA16:
    //    return QVideoFrameFormat::PixelFormat::?;
    case NDIlib_FourCC_video_type_YV12:
        return QVideoFrameFormat::PixelFormat::Format_YV12;
    //case NDIlib_FourCC_video_type_I420:
    //    return QVideoFrameFormat::PixelFormat::?;
    case NDIlib_FourCC_video_type_NV12:
        return QVideoFrameFormat::PixelFormat::Format_NV12;
    case NDIlib_FourCC_video_type_BGRA:
        return QVideoFrameFormat::PixelFormat::Format_BGRA8888;
    case NDIlib_FourCC_video_type_BGRX:
        return QVideoFrameFormat::PixelFormat::Format_BGRX8888;
    case NDIlib_FourCC_video_type_RGBA:
        return QVideoFrameFormat::PixelFormat::Format_RGBA8888;
    case NDIlib_FourCC_video_type_RGBX:
        return QVideoFrameFormat::PixelFormat::Format_RGBX8888;
    default:
        return QVideoFrameFormat::PixelFormat::Format_Invalid;
    }
}
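For completeness, here is a minimal sketch of how the receiver and the two functions above could be driven. captureLoop, m_running and m_videoSinks are illustrative names made up for this sketch, while NDIlib_recv_capture_v2 and NDIlib_recv_free_video_v2 are the regular NDI SDK calls:
// Minimal sketch: pull frames from the receiver created above and hand
// video frames to processVideo(); pNdiRecv and m_videoSinks are assumed
// to be members that were set up elsewhere.
void NdiReceiverWorker::captureLoop()
{
    while (m_running)
    {
        NDIlib_video_frame_v2_t videoFrame;
        // Wait up to 1000 ms for the next frame from the receiver.
        switch (NDIlib_recv_capture_v2(pNdiRecv, &videoFrame, nullptr, nullptr, 1000))
        {
        case NDIlib_frame_type_video:
            processVideo(&videoFrame, &m_videoSinks);
            NDIlib_recv_free_video_v2(pNdiRecv, &videoFrame); // always free the frame
            break;
        default:
            break; // audio, metadata and timeouts are ignored in this sketch
        }
    }
}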

Extracting text with iText does not work: encoding problem or encrypted text?

I have a PDF file that has the following security properties: printing: allowed; document assembly: NOT allowed; content copying: allowed; content copying for accessibility: allowed; page extraction: NOT allowed.
I try to extract the text with sample code based on the documentation samples, as follows:
pdftext.Text = null;
StringBuilder text = new StringBuilder();
PdfReader pdfReader = new PdfReader(filename);
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
    ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
    string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
    text.Append(System.Environment.NewLine);
    text.Append("\n Page Number:" + page);
    text.Append(System.Environment.NewLine);
    currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
    text.Append(currentText);
    progressBar1.Value++;
}
pdftext.Text += text.ToString();
pdfReader.Close();
but the output text consists of lines like "??? ? ???????\n?? ??? ?";
it seems that the file is encrypted or we have an encoding problem...
Note the results of the following calls:
var f = pdfReader.IsOpenedWithFullPermissions;  // -> FALSE
var f1 = pdfReader.IsEncrypted();               // -> FALSE
var f2 = pdfReader.ComputeUserPassword();       // -> NULL
var f3 = pdfReader.Is128Key();                  // -> FALSE
var f4 = pdfReader.HasUsageRights();            // -> FALSE
f, f1, f3 and f4 return FALSE, so it seems that the document is not encrypted,
...so I don't know whether it is an encoding problem or something related to encrypted strings...
Can someone help me?
Thanks in advance.
G.G.
Whenever you have trouble extracting text from a document using standard code, the first thing to do is to try to copy & paste the text from it using Adobe Acrobat Reader. Adobe Reader's copy & paste implements text extraction according to the recommendations of the PDF specification, and if it fails, this usually means that the information required for text extraction is either missing from the document or broken (by accident or by design). To extract the text, one then either needs to customize the code to the specific PDF or resort to OCR.
In the case of the document at hand, Adobe Reader copy & paste results in garbage too, just like extraction with iText. Thus, there is something fishy in the document.
Inspecting the document one finds that the fonts contain ToUnicode mappings like this:
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo <<
  /Registry (Adobe)
  /Ordering (Identity)
  /Supplement 0
>> def
/CMapName /F18 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
44 beginbfrange
<20> <20> <0020>
<21> <21> <E0F9>
<22> <22> <E0F1>
<23> <23> <E0FA>
<24> <24> <E0F7>
<25> <25> <E0A3>
<26> <26> <E084>
<27> <27> <E097>
<28> <28> <E098>
<29> <29> <E09A>
<2A> <2A> <E08A>
<2B> <2B> <E099>
<2C> <2C> <E0A5>
<2D> <2D> <E086>
<2E> <2E> <E094>
<2F> <2F> <E0DE>
<30> <30> <E0A6>
<31> <31> <E096>
<32> <32> <E088>
<33> <33> <E082>
<34> <34> <E04C>
<35> <35> <E0A4>
<36> <36> <E0F6>
<37> <37> <E0F2>
<38> <38> <E0D8>
<39> <39> <E0AA>
<3A> <3A> <E06C>
<3B> <3B> <E087>
<3C> <3C> <E095>
<3D> <3D> <E0C4>
<3E> <3E> <E07E>
<3F> <3F> <E055>
<40> <40> <E089>
<41> <41> <E085>
<42> <42> <E083>
<43> <43> <E070>
<44> <44> <E0E6>
<45> <45> <E080>
<46> <46> <E0C8>
<47> <47> <E0F4>
<48> <48> <E062>
<49> <49> <E0F3>
<4A> <4A> <E04E>
<4B> <4B> <E05E>
endbfrange
endcmap CMapName currentdict /CMap defineresource pop end end
I.e. (in case you are not familiar with the notation), the fonts claim that all their glyphs, with the exception of the space glyph at 0x20, represent characters U+E0xx from the Unicode Private Use Area. As the name of that area indicates, there is no common meaning for characters with these values.
Thus, text extraction according to the PDF specification returns strings of characters with undefined meaning, with results like those you observed in iText and I saw in Adobe Reader.
Sometimes in such a situation one can still enforce proper text extraction by ignoring the ToUnicode map and using either the font Encoding or information inside the embedded font program.
Unfortunately, it turns out that here the Encoding effectively contains the same information as the ToUnicode map, e.g. for the same font as above:
/Differences [ 32 /space /uniE0F9 /uniE0F1 /uniE0FA /uniE0F7 /uniE0A3 /uniE084 /uniE097 /uniE098
/uniE09A /uniE08A /uniE099 /uniE0A5 /uniE086 /uniE094 /uniE0DE /uniE0A6 /uniE096
/uniE088 /uniE082 /uniE04C /uniE0A4 /uniE0F6 /uniE0F2 /uniE0D8 /uniE0AA /uniE06C
/uniE087 /uniE095 /uniE0C4 /uniE07E /uniE055 /uniE089 /uniE085 /uniE083 /uniE070
/uniE0E6 /uniE080 /uniE0C8 /uniE0F4 /uniE062 /uniE0F3 /uniE04E /uniE05E ]
and the fonts turn out to be Type 3 fonts, i.e. there is no embedded font program; each glyph is defined as an individual PDF canvas without further character information.
Thus, there is nothing to gain here either.
Actually, these small PDF canvases contain inlined bitmap graphics of the respective glyphs, which is also the cause of the poor graphical quality of the document (if you don't see that immediately, simply zoom in a bit and you'll see the ragged outlines of the glyphs).
By the way, such a construct usually means that the producer of the PDF explicitly wants to prevent text extraction.
If you happen to have to extract text from many such documents, you can try and determine a mapping from their U+E0xx characters to actually sensible Unicode characters and apply that mapping to your extracted text.
If all those fonts in all those documents happen to use the same U+E0xx codepoints for the same actual characters, you'll be able to do text extraction from those documents after investing a certain amount of initial work.
Otherwise do try OCR.
The following code adds pages to a document which map the ToUnicode values to the characters shown:
void AddFontsTo(PdfReader reader, PdfStamper stamper)
{
    int documentPages = reader.NumberOfPages;
    for (int page = 1; page <= documentPages; page++)
    {
        // ignore inherited resources for now
        PdfDictionary pageResources = reader.GetPageResources(page);
        if (pageResources == null)
            continue;
        PdfDictionary pageFonts = pageResources.GetAsDict(PdfName.FONT);
        if (pageFonts == null || pageFonts.Size == 0)
            continue;

        List<BaseFont> fonts = new List<BaseFont>();
        List<string> fontNames = new List<string>();
        HashSet<char> chars = new HashSet<char>();

        foreach (PdfName key in pageFonts.Keys)
        {
            PdfIndirectReference fontReference = pageFonts.GetAsIndirectObject(key);
            if (fontReference == null)
                continue;
            DocumentFont font = (DocumentFont) BaseFont.CreateFont((PRIndirectReference)fontReference);
            if (font == null)
                continue;

            PdfObject toUni = PdfReader.GetPdfObjectRelease(font.FontDictionary.Get(PdfName.TOUNICODE));
            CMapToUnicode toUnicodeCmap = null;
            if (toUni is PRStream)
            {
                try
                {
                    byte[] touni = PdfReader.GetStreamBytes((PRStream)toUni);
                    CidLocationFromByte lb = new CidLocationFromByte(touni);
                    toUnicodeCmap = new CMapToUnicode();
                    CMapParserEx.ParseCid("", toUnicodeCmap, lb);
                }
                catch
                {
                    toUnicodeCmap = null;
                }
            }
            if (toUnicodeCmap == null)
                continue;

            ICollection<int> mapValues = toUnicodeCmap.CreateDirectMapping().Values;
            if (mapValues.Count == 0)
                continue;

            fonts.Add(font);
            fontNames.Add(key.ToString());
            foreach (int value in mapValues)
                chars.Add((char)value);
        }
        if (fonts.Count == 0 || chars.Count == 0)
            continue;

        Rectangle size = (fonts.Count > 10) ? PageSize.A4.Rotate() : PageSize.A4;
        PdfPTable table = new PdfPTable(fonts.Count + 1);
        table.AddCell("Page " + page);
        foreach (String name in fontNames)
        {
            table.AddCell(name);
        }
        table.HeaderRows = 1;

        float[] widths = new float[fonts.Count + 1];
        widths[0] = 2;
        for (int i = 1; i <= fonts.Count; i++)
            widths[i] = 1;
        table.SetWidths(widths);
        table.WidthPercentage = 100;

        List<char> charList = new List<char>(chars);
        charList.Sort();
        foreach (char character in charList)
        {
            table.AddCell(((int)character).ToString("X4"));
            foreach (BaseFont font in fonts)
            {
                table.AddCell(new PdfPCell(new Phrase(character.ToString(), new Font(font))));
            }
        }

        stamper.InsertPage(reader.NumberOfPages + 1, size);
        ColumnText columnText = new ColumnText(stamper.GetUnderContent(reader.NumberOfPages));
        columnText.AddElement(table);
        columnText.SetSimpleColumn(size);
        while ((ColumnText.NO_MORE_TEXT & columnText.Go(false)) == 0)
        {
            stamper.InsertPage(reader.NumberOfPages + 1, size);
            columnText.Canvas = stamper.GetUnderContent(reader.NumberOfPages);
            columnText.SetSimpleColumn(size);
        }
    }
}
I applied it to your document like this:
string input = @"4700198773.pdf";
string output = @"4700198773-fonts.pdf";
using (PdfReader reader = new PdfReader(input))
using (FileStream stream = new FileStream(output, FileMode.Create, FileAccess.Write))
using (PdfStamper stamper = new PdfStamper(reader, stream))
{
    AddFontsTo(reader, stamper);
}
The additional pages look like this:
Now you have to compare the outputs for the different fonts and pages of this document with each other and with those of a representative selection of files. If you find a good enough pattern, you can try this replacement approach.

How do I translate an LR(1) parse into an abstract syntax tree?

I have coded a table-driven LR(1) parser and it is working very well; however, I am having a bit of a disconnect at the stage of turning a parse into a syntax tree / abstract syntax tree. This is a project that I am very passionate about, but I have really just hit a dead end here. Thank you in advance for your help.
Edit: My parser just uses a 2D array and an Action object that tells it where to go next, or, if it is a reduction, where to go and how many items to pop. I noticed that many people use the visitor pattern, but I'm not sure how they know what type of node to make.
Here is the pushdown automaton loop for context:
while (lexer.hasNext() || parseStack.size() > 0) {
    Action topOfStack = parseStack.peek();
    token = parseStack.size() > 0 ? lexer.nextToken() : new Token(TokenType.EOF, "EOF");
    topOfStack.setToken(token);
    int row = topOfStack.getTransitionIndex();
    int column = getTerminalIndex(token.getLexeme());
    column = token.getType() == TokenType.IDENTIFIER
            && !terminalsContain(token.getLexeme()) ? 0 : column;
    Action action = actionTable[row][column];
    if (action instanceof Accept) {
        System.out.println("valid parse!!!!!!");
    } else if (action instanceof Reduction) {
        Reduction reduction = (Reduction) action;
        popStack(parseStack, reduction.getNumberOfItemsToPop());
        column = reduction.getTransitionIndex();
        row = parseStack.peek().getTransitionIndex();
        parseStack.push(new Action(gotoTable[row][column]));
        lexer.backupTokenStream();
    } else if (action != null) {
        parseStack.push(actionTable[row][column]);
    } else {
        System.out.println("Parse error");
        System.out.println("On token: " + token.getLexeme());
        break;
    }
}
Each reduction in the LR parsing process corresponds to an internal node in the parse tree. The rule being reduced is the internal AST node, and the items popped off the stack correspond to the children of that internal node. The item pushed for the goto corresponds to the internal node, while those pushed by shift actions correspond to leaves (tokens) of the AST.
Putting all that together, you can build an AST by creating a new internal node every time you do a reduction and wiring everything together appropriately, as sketched below.
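A minimal sketch of that idea (written here in C++ for brevity; the shape carries over directly to the Java loop above, and all names are made up): keep a node stack next to the parse stack, push a leaf on every shift, and on every reduction pop as many nodes as the rule's right-hand side has symbols and attach them to a new node for that rule.
#include <string>
#include <vector>

// Sketch only: a node stack maintained alongside the parse stack.
// onShift/onReduce are hooks you would call from the shift and reduction branches.
struct Node {
    std::string label;           // rule name for internal nodes, lexeme for leaves
    std::vector<Node> children;  // empty for leaves (tokens)
};

std::vector<Node> nodeStack;

// Shift action: the token becomes a leaf.
void onShift(const std::string &lexeme) {
    nodeStack.push_back({lexeme, {}});
}

// Reduction by rule A -> X1 ... Xn: the n nodes on top of the node stack
// (mirroring the n states popped from the parse stack) become the children
// of a new internal node for A, which is pushed in their place, just as the
// goto pushes the new state.
void onReduce(const std::string &ruleName, int n) {
    Node parent{ruleName, {}};
    parent.children.assign(nodeStack.end() - n, nodeStack.end()); // keeps left-to-right order
    nodeStack.erase(nodeStack.end() - n, nodeStack.end());
    nodeStack.push_back(std::move(parent));
}

// When the Accept action fires, nodeStack holds exactly one node: the root.
// To get an AST rather than a full parse tree, build a rule-specific node in
// onReduce (e.g. drop punctuation tokens or collapse single-child chains)
// instead of a generic one: the rule number of the reduction tells you which
// type of node to make.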

Firebase FQuery: how do you detect when you are at the end of a list of nodes?

How do I detect when I have finished processing all found nodes when doing a query? In the following example, I do some processing on each encountered node. When I reach the "end" of the list I would like to be able to detect this so I know it's finished.
FQuery* messageListQuery = [m_firebaseRef queryLimitedToNumberOfChildren:100];
[messageListQuery observeEventType:FEventTypeChildAdded andPreviousSiblingNameWithBlock:^(FDataSnapshot *snapshot, NSString *prevNodeName) {
    // 1. Do interesting stuff with the snapshot data
    // 2. I want to detect when I'm at the end of the list so I know when I'm done processing the list.
}];
Here is the example use case: I would like to load the latest 100 messages in the background. Once the messages have been loaded, I would like to update the UI. However, I'm not sure how to know that all the messages have been loaded, given that there might be fewer than 100 messages in the list.
I figured out how to read all the messages up front by using observeSingleEventOfType and then iterating over the children:
[m_firebaseRef observeSingleEventOfType:FEventTypeValue withBlock:^(FDataSnapshot *snapshot) {
    NSLog( @"Name %@ with %d children.", snapshot.name, snapshot.childrenCount );
    for( FDataSnapshot *child in snapshot.children )
    {
        NSDictionary *msgData = child.value;
        NSString *message = msgData[kFirebaseLiveChatFieldMessage];
        NSString *gamerTag = msgData[kFirebaseLiveChatFieldGamerTag];
        NSString *gameCenterId = msgData[kFirebaseLiveChatFieldGameCenterId];
        NSLog( @"Preload = %@ (%@): %@", gamerTag, gameCenterId, message );
    }
}];

How to change the metadata tags of PNG image files

I am using the PngCs DLL to fetch the chunk data of a PNG image file in ASP.NET. I am able to do that, but now I want to update the chunk data of that PNG.
I used PngWriter, but it creates a whole new file without inheriting the chunk data.
PngReader pngr = FileHelper.CreatePngReader(path);
pngr.GetMetadata().GetTxtForKey(PngChunkITXT.KEY_Title);
Response.Write(pngr.GetMetadata().GetTxtForKey(PngChunkITXT.KEY_Title));
The code below writes a new PNG image through PngWriter; I want to embed a new iTXt chunk while creating the new file.
PngReader pngr = FileHelper.CreatePngReader(origFilename); // or you can use the constructor
PngWriter pngw = FileHelper.CreatePngWriter(destFilename, pngr.ImgInfo, true); // idem
Console.WriteLine(pngr.ToString()); // just information
int chunkBehav = ChunkCopyBehaviour.COPY_ALL_SAFE; // tell it to copy all 'safe' chunks
pngw.CopyChunksFirst(pngr, chunkBehav); // copy some metadata from reader
for (int row = 0; row < pngr.ImgInfo.Rows; row++)
{
    ImageLine l1 = pngr.ReadRowInt(row); // format: RGBRGB... or RGBARGBA...
    pngw.WriteRow(l1, row);
}
pngw.CopyChunksLast(pngr, chunkBehav); // metadata after the image pixels? can happen
pngw.End(); // don't forget this
pngr.End();
for further reference click this link
Try this:
PngReader pngr = FileHelper.CreatePngReader(origFilename);
PngWriter pngw = FileHelper.CreatePngWriter(destFilename, pngr.ImgInfo, true);
pngw.CopyChunksFirst(pngr, ChunkCopyBehaviour.COPY_ALL);
pngw.GetMetadata().SetText(myKey, myText, false, false); // provide your own key and text
for (int row = 0; row < pngr.ImgInfo.Rows; row++)
{
    ImageLine l1 = pngr.ReadRowInt(row);
    pngw.WriteRow(l1, row);
}
pngw.CopyChunksLast(pngr, ChunkCopyBehaviour.COPY_ALL);
pngw.End(); // don't forget this
pngr.End();
The problem has been solved by using CsXMpToolKit.dll, which is the best option to fetch the metadata from any type of file.
