Read the contents of one PDF file into another PDF file using PDFsharp

I want to read the contents of a PDF file into another PDF file using PDFsharp. My code is below. I just want to read the contents, either as bytes or as a string, but it has to be done with PDFsharp.
public static void Main()
{
PdfDocument pdf = PdfReader.Open(@"C:\backup_temp\Template.pdf");
string outputText= "";
foreach (PdfPage page in pdf.Pages)
{
for (int index = 0; index < page.Contents.Elements.Count; index++)
{
PdfDictionary.PdfStream stream = page.Contents.Elements.GetDictionary(index).Stream;
outputText += new PDFParser().ExtractTextFromPDFBytes(stream.Value);
}
}
string pdfFilename = @"C:\backup_temp\source.pdf";
pdf.Save(pdfFilename);
}

Related

How to detect a corrupted video while picking it in Xamarin Android

I am using the Xamarin.Essentials file picker to pick multiple videos. A corrupted video can still be selected and appears in the media list, and the failure only shows at upload time as "setdatasource failed". How can I detect that a video is corrupted at the time it is picked, or is there another option?
result = await Xamarin.Essentials.FilePicker.PickMultipleAsync(new Xamarin.Essentials.PickOptions
{
FileTypes = Xamarin.Essentials.FilePickerFileType.Videos,
PickerTitle = "Please pick a videos"
});
If you want to detect a corrupted video while picking it, you could compare MD5 values.
This only works if you already have the correct MD5 value. For example, I put the correct MD5 value in a .txt file in the Assets folder, then compute the MD5 value when the video is picked and compare the two.
public void Compare()
{
string content;
AssetManager asset1 = Android.App.Application.Context.Assets;
using (StreamReader sr = new StreamReader(asset1.Open("AboutAssets.txt")))
{
content = sr.ReadToEnd();
}
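// Placeholder: replace with the known-good MD5 hex string.
string hash = "sfsgDGDgds";
// VerifyMd5Hash hashes its input itself, so pass the raw content, not a hash of it.
var result = VerifyMd5Hash(content, hash);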
}
static string GetMd5Hash(string input)
{
using (MD5 md5Hash = MD5.Create())
{
// Convert the input string to a byte array and compute the hash.
byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
// Create a new Stringbuilder to collect the bytes
// and create a string.
StringBuilder sBuilder = new StringBuilder();
// Loop through each byte of the hashed data
// and format each one as a hexadecimal string.
for (int i = 0; i < data.Length; i++)
{
sBuilder.Append(data[i].ToString("x2"));
}
// Return the hexadecimal string.
return sBuilder.ToString();
}
}
// Verify a hash against a string.
static bool VerifyMd5Hash(string input, string hash)
{
// Hash the input.
string hashOfInput = GetMd5Hash(input);
// Create a StringComparer and compare the hashes.
StringComparer comparer = StringComparer.OrdinalIgnoreCase;
return 0 == comparer.Compare(hashOfInput, hash);
}
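The sample above hashes a text string. For a picked video, the hash would have to be computed over the file's bytes instead. A minimal sketch, assuming the picked file can be opened as a `Stream` (with Xamarin.Essentials that would presumably come from the `FileResult`); `FileHasher`, `ComputeMd5Hex`, and `MatchesExpected` are hypothetical names used only for illustration:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHasher
{
    // Hashes the raw bytes of the stream and returns the digest as lowercase hex.
    public static string ComputeMd5Hex(Stream input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(input);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Compares the stream's digest with a known-good hex string, ignoring case.
    public static bool MatchesExpected(Stream input, string expectedHex)
    {
        return string.Equals(ComputeMd5Hex(input), expectedHex,
            StringComparison.OrdinalIgnoreCase);
    }
}
```

Note that a matching hash only shows that the file is byte-for-byte identical to a known reference file, so this approach applies when the expected videos are known in advance.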

Displaying mht file in iframe

I need to display an MHT file, stored in a ZIP archive, in an iframe on the page:
<iframe src="@Url.Action("LoadInstrucion","Pharmacy", new {id= Model.Instrukciya })"></iframe>
The action in the MVC controller that returns the file:
public ActionResult LoadInstrucion(string id)
{
var bytes = InstructionsLoader.LoadInstrucion(id);
return File(bytes, "multipart/related");
}
The method that reads the byte array from the file:
public static byte[] LoadInstrucion(string zipFileName)
{
string zipfilePath = $@"{HttpContext.Current.Request.PhysicalApplicationPath}Content\inst\{zipFileName}.zip";
if (File.Exists(zipfilePath))
{
using (var zipStream = new FileStream(zipfilePath, FileMode.Open))
using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
{
if (archive.Entries.Count > 0)
{
var file = archive.Entries[0];
using (var stream = file.Open())
using (var ms = new MemoryStream())
{
stream.CopyTo(ms);
return ms.ToArray();
}
}
}
}
return new byte[0];
}
If I navigate to the URL directly, I see the requested MHT file, but it is not displayed in the iframe. In the dev console I get this warning:
Attempted to load a multipart archive into an subframe

Issue with Returning Epplus spreadsheets with an Image in .zip file using DotNetZip

This scenario works fine without any images on the spreadsheets, but after I add an image to the spreadsheets that go into the zip file, they open with the Excel error "We found a problem with some content ....".
I have other methods that use EPPlus without DotNetZip and the exact same code to insert the image into a spreadsheet, and they work with no errors or issues.
Code that works and returns a single spreadsheet with an image:
public async Task<ActionResult> GenerateSpreadsheet(ReportViewModel reportViewModel)
{
using (var excelPackage = new ExcelPackage())
{
Bitmap logoFile = getLogoFile();
var companyLogo = worksheet.Drawings.AddPicture("File Name", logoFile);
companyLogo.From.Column = columnIndex - 4;
companyLogo.From.Row = rowIndex;
companyLogo.SetSize(logoFile.Width, logoFile.Height);
//Write all the stuff to the spreadsheet
Response.ClearContent();
Response.BinaryWrite(excelPackage.GetAsByteArray());
string fileName = "attachment;filename=Project_Report_Export.xlsx";
Response.AddHeader("content-disposition", fileName);
Response.ContentType = "application/excel";
Response.Flush();
Response.End();
}
}
Code that builds each spreadsheet and adds it to a zip file. If an image is added to the spreadsheet as shown below, the spreadsheet opens with the "We found a problem with some content ...." error. If no image is added, it opens without the error.
public async Task<ActionResult> GenerateSpreadsheet(ReportViewModel reportViewModel)
{
using (var stream = new MemoryStream())
{
using (ZipFile zip = new ZipFile())
{
foreach(var spreadSheet in listOfStuffToBuildFrom)
{
using (var excelPackage = new ExcelPackage())
{
Bitmap logoFile = getLogoFile();
var companyLogo = worksheet.Drawings.AddPicture("File Name", logoFile);
companyLogo.From.Column = columnIndex - 4;
companyLogo.From.Row = rowIndex;
companyLogo.SetSize(logoFile.Width, logoFile.Height);
//Write all the stuff to the spreadsheet
//Add the workbook to the zip file
zip.AddEntry(excelPackage.Workbook.Properties.Title, excelPackage.GetAsByteArray());
}
}
zip.Save(stream);
return File(stream.ToArray(), System.Net.Mime.MediaTypeNames.Application.Zip, "Project Reports.zip");
}
}
}
Why does the second method return spreadsheets that open with the error "We found a problem with some content ...."?

Asp.net check if HttpPostedFileBase is a Word Document

I need a function that checks whether an HttpPostedFileBase is a Word document. I don't want to check the file extension because the user can change it.
I tried reading the header of the binary data, which starts with PK (PDF files, for example, start with %PDF), but I don't know whether I can rely on that.
[HttpPost]
public ActionResult UploadFile(HttpPostedFileBase file)
{
string header = null;
using (MemoryStream ms = new MemoryStream())
{
file.InputStream.CopyTo(ms);
ms.Position = 0;
using (StreamReader sr = new StreamReader(ms))
{
char[] buffer = new char[5];
sr.Read(buffer, 0, 4);
header =
string.Format("{0}{1}{2}{3}{4}", buffer[0], buffer[1], buffer[2], buffer[3], buffer[4]);
}
}
if (header.StartsWith("%PDF"))
{
// PDF Document
}
if (header.StartsWith("PK"))
{
// Microsoft Word Document ?
}
return Json(new { }, JsonRequestBehavior.AllowGet);
}
The first two bytes of a Word document (DOCX) are PK because a DOCX file is actually a PKZip archive, so no, this check is not reliable on its own: any other ZIP-based file starts with the same two bytes.
The ForensicsWiki page here may help:
http://www.forensicswiki.org/wiki/Word_Document_%28DOC%29
and
http://www.forensicswiki.org/wiki/DOCX
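Since a DOCX file is a ZIP archive, one way to go a step beyond the PK signature is to open the upload as a ZIP and look for the parts a Word document normally contains. The following is only a sketch under that assumption: `UploadSniffer` and `LooksLikeDocx` are hypothetical names, and the check relies on the conventional `word/document.xml` location, which Word uses by default but which a package is not strictly required to use. It does not cover the older binary DOC format.

```csharp
using System.IO;
using System.IO.Compression;

public static class UploadSniffer
{
    // Returns true when the stream is a readable ZIP archive that contains
    // the parts a Word-produced DOCX package normally has.
    public static bool LooksLikeDocx(Stream input)
    {
        try
        {
            // leaveOpen: true so the caller can still use the upload stream afterwards.
            using (var archive = new ZipArchive(input, ZipArchiveMode.Read, leaveOpen: true))
            {
                return archive.GetEntry("[Content_Types].xml") != null
                    && archive.GetEntry("word/document.xml") != null;
            }
        }
        catch (InvalidDataException)
        {
            // Not a valid ZIP archive at all (for example a PDF or plain text).
            return false;
        }
    }
}
```

In the controller action this could be called as `UploadSniffer.LooksLikeDocx(file.InputStream)`. A positive result is still a heuristic rather than proof that the document is well-formed.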

Biztalk 2010 Custom Pipeline Component returns binary

I have created a custom pipeline component which transforms a complex Excel spreadsheet to XML. The transformation works fine and I can write out the data to check it. However, when I assign this data to the BodyPart.Data of inmsg or of a new message, I always get a routing failure. When I look at the message in the admin console, the body appears to contain binary data (I presume the original Excel file) rather than the XML I assigned - see screen shot below. I have followed numerous tutorials and tried many different ways of doing this, but I always get the same result.
My current code is:
public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(Microsoft.BizTalk.Component.Interop.IPipelineContext pc, Microsoft.BizTalk.Message.Interop.IBaseMessage inmsg)
{
//make sure we have something
if (inmsg == null || inmsg.BodyPart == null || inmsg.BodyPart.Data == null)
{
throw new ArgumentNullException("inmsg");
}
IBaseMessagePart bodyPart = inmsg.BodyPart;
//create a temporary directory
const string tempDir = @"C:\test\excel";
if (!Directory.Exists(tempDir))
{
Directory.CreateDirectory(tempDir);
}
//get the input filename
string inputFileName = Convert.ToString(inmsg.Context.Read("ReceivedFileName", "http://schemas.microsoft.com/BizTalk/2003/file-properties"));
swTemp.WriteLine("inputFileName: " + inputFileName);
//set path to write excel file
string excelPath = tempDir + @"\" + Path.GetFileName(inputFileName);
swTemp.WriteLine("excelPath: " + excelPath);
//write the excel file to a temporary folder
bodyPart = inmsg.BodyPart;
Stream inboundStream = bodyPart.GetOriginalDataStream();
Stream outFile = File.Create(excelPath);
inboundStream.CopyTo(outFile);
outFile.Close();
//process excel file to return XML
var spreadsheet = new SpreadSheet();
string strXmlOut = spreadsheet.ProcessWorkbook(excelPath);
//now build an XML doc to hold this data
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(strXmlOut);
XmlDocument finalMsg = new XmlDocument();
XmlElement xEle;
xEle = finalMsg.CreateElement("ns0", "BizTalk_Test_Amey_Pipeline.textXML",
"http://tempuri.org/INT018_Workbook.xsd");
finalMsg.AppendChild(xEle);
finalMsg.FirstChild.InnerXml = xDoc.FirstChild.InnerXml;
//write xml to memory stream
swTemp.WriteLine("Write xml to memory stream");
MemoryStream streamXmlOut = new MemoryStream();
finalMsg.Save(streamXmlOut);
streamXmlOut.Position = 0;
inmsg.BodyPart.Data = streamXmlOut;
pc.ResourceTracker.AddResource(streamXmlOut);
return inmsg;
}
Here is a sample of writing the message back:
IBaseMessage Microsoft.BizTalk.Component.Interop.IComponent.Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
IBaseMessagePart bodyPart = pInMsg.BodyPart;
if (bodyPart != null)
{
using (Stream originalStrm = bodyPart.GetOriginalDataStream())
{
byte[] changedMessage = ConvertToBytes(ret);
using (Stream strm = new AsciiStream(originalStrm, changedMessage, resManager))
{
// Setup the custom stream to put it back in the message.
bodyPart.Data = strm;
pContext.ResourceTracker.AddResource(strm);
}
}
}
return pInMsg;
}
The AsciiStream used a method like this to read the stream:
override public int Read(byte[] buffer, int offset, int count)
{
int ret = 0;
int bytesRead = 0;
byte[] FixedData = this.changedBytes;
if (FixedData != null)
{
bytesRead = count > (FixedData.Length - overallOffset) ? FixedData.Length - overallOffset : count;
Array.Copy(FixedData, overallOffset, buffer, offset, bytesRead);
if (FixedData.Length == (bytesRead + overallOffset))
this.changedBytes = null;
// Increment the overall offset.
overallOffset += bytesRead;
offset += bytesRead;
count -= bytesRead;
ret += bytesRead;
}
return ret;
}
First of all, I would add more logging to your component around the MemoryStream logic. For example, write the stream out to the file system so you can confirm that the XML version is correct. You can also attach to the BizTalk process and step through the component's code, which makes debugging a lot easier.
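As a rough illustration of the diagnostic step above, a small helper could copy the outgoing stream to disk and then put the position back where it was, so the pipeline still reads the stream as before. This is a sketch only: `StreamDiagnostics` and `DumpToFile` are hypothetical names, and it assumes a seekable stream such as the MemoryStream in the question.

```csharp
using System.IO;

public static class StreamDiagnostics
{
    // Copies the whole (seekable) stream to a file for inspection,
    // then restores the original position.
    public static void DumpToFile(Stream source, string path)
    {
        long originalPosition = source.Position;
        source.Position = 0;
        using (FileStream file = File.Create(path))
        {
            source.CopyTo(file);
        }
        source.Position = originalPosition;
    }
}
```

In the component from the question, this would be called on `streamXmlOut` just before it is assigned to `inmsg.BodyPart.Data`, with a path of your choosing.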
I would then try replacing the MemoryStream with a more basic custom stream that writes the bytes for you. The BizTalk SDK samples for pipeline components include some examples of a custom stream. You would have to customize the sample so that it just writes the stream. I can work on posting an example, but do the additional diagnostics above first.
Thanks,
