Merging/filling pdf form file with xml data - asp.net

Let's say I have a PDF form available on a website, which users fill in and submit to the server. On the server side (ASP.NET) I would like to merge the data that I receive in XML format with the empty PDF form that was filled in, and save the result.
From what I have found, there are several possible ways of doing it:
1. Use a PDF form created by Adobe Acrobat and fill it with iTextSharp.
2. Use a PDF form created by Adobe Acrobat and fill it with FDF Toolkit .net (which seems to use iTextSharp internally).
3. Use pdftk to fill the form.
4. Use a PDF form created with Adobe LiveCycle and merge the data using the Form Data Integration Service.
As I have no experience with this kind of task can you advise which option would be better/easier and give some additional tips?
Thank you in advance.

I would suggest using the 4th approach if possible because it would be cleaner. You would be using solutions specifically tailored for what you are asking to do, but if you don't have the available resources for such a solution I would suggest using the 1st option.
The 1st option is what I have recently dove into. I have found it relatively painless to implement.
Option 1 is possible if the following apply:
You have control over the development of the PDF forms.
You have control over the format of the XML data.
You can live with uncompressed (fastweb=false) PDF files.
Example of implementation:
Use Adobe Acrobat to generate a PDF form. Tip: use Adobe native fonts when generating the forms. Each control that uses a non-native font causes that font to be embedded, which bloats the file when it is not compressed, and to my knowledge iTextSharp currently does not produce compressed PDFs.
Use the iTextSharp library to combine the XML data with the PDF form to generate a populated document. Tip: to manually populate a PDF form from XML, you must map XML values to control names in the PDF form and match them by page, as shown in the example below.
// Requires: using System.IO; using System.Xml;
//           using iTextSharp.text; using iTextSharp.text.pdf;

// Usage in the page: generate the PDF and stream it to the response
using (MemoryStream stream = GeneratePDF(m_FormsPath, oXmlData))
{
    byte[] bytes = stream.ToArray();
    Response.ContentType = "application/pdf";
    Response.BinaryWrite(bytes);
    Response.End();
}

/// <summary>
/// This method combines pdf forms with xml data
/// </summary>
/// <param name="m_FormName">pdf form file path</param>
/// <param name="oData">xml dataset</param>
/// <returns>memory stream containing the pdf data</returns>
private MemoryStream GeneratePDF(string m_FormName, XmlDocument oData)
{
    MemoryStream msOutput = new MemoryStream();
    Document doc = new Document();
    PdfCopy pCopy = new PdfCopy(doc, msOutput);
    pCopy.AddViewerPreference(PdfName.PICKTRAYBYPDFSIZE, new PdfBoolean(true));
    pCopy.AddViewerPreference(PdfName.PRINTSCALING, PdfName.NONE);
    doc.Open();

    // read the page count once, then release the reader
    PdfReader pdfTemplate = new PdfReader(m_FormName);
    int pageCount = pdfTemplate.NumberOfPages;
    pdfTemplate.Close();

    for (int i = 1; i <= pageCount; i++)
    {
        // fill a fresh copy of the template with the values for page i
        MemoryStream msTemp = new MemoryStream();
        pdfTemplate = new PdfReader(m_FormName);
        PdfStamper stamper = new PdfStamper(pdfTemplate, msTemp);
        // map xml values to pdf form controls (element name = control name)
        foreach (XmlElement oElem in oData.SelectNodes("/form/page" + i + "/*"))
        {
            stamper.AcroFields.SetField(oElem.Name, oElem.InnerText);
        }
        stamper.FormFlattening = true;
        stamper.Close();
        pdfTemplate.Close();

        // copy only page i of the filled, flattened document into the output
        PdfReader tempPDF = new PdfReader(msTemp.ToArray());
        pCopy.AddPage(pCopy.GetImportedPage(tempPDF, i));
        pCopy.FreeReader(tempPDF);
    }
    doc.Close();
    return msOutput;
}
Save the File or post the file to the response of your ASP.Net page
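For reference, the XPath used in GeneratePDF ("/form/page" + i + "/*") implies an XML layout with one pageN element per PDF page, whose child element names match the form's control names. Below is a minimal sketch of that layout and of the per-page lookup. The field names (FirstName, LastName, Comments) and the FormXmlMapping helper are made up for illustration and are not part of iTextSharp or of the original form.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FormXmlMapping
{
    // Hypothetical sample data: <form> root, one <pageN> element per page,
    // child element name = control name in the PDF form.
    public static XmlDocument SampleData()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<form>" +
              "<page1><FirstName>Jane</FirstName><LastName>Doe</LastName></page1>" +
              "<page2><Comments>None</Comments></page2>" +
            "</form>");
        return doc;
    }

    // Returns the (control name, value) pairs for one page, using the same
    // XPath convention as GeneratePDF.
    public static List<KeyValuePair<string, string>> FieldsForPage(XmlDocument data, int page)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (XmlNode node in data.SelectNodes("/form/page" + page + "/*"))
        {
            result.Add(new KeyValuePair<string, string>(node.Name, node.InnerText));
        }
        return result;
    }
}
```

With this layout, FieldsForPage(SampleData(), 1) yields the FirstName and LastName pairs, and a page with no matching element simply yields an empty list, so nothing is set on that page.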

Since you tagged this 'LiveCycle', I take it you have an installation of Adobe LiveCycle running somewhere (or can install it somewhere).
In that case, I'd go for number 4 (with the modification of using the Adobe LiveCycle Forms ES module). The other three will undoubtedly yield compatibility issues in the long run. With the LiveCycle server (running the Forms module), you'll be able to handle any PDF, whether it's old, new, static, dynamic, compressed, Acrobat-based or LiveCycle-based.
You should be able to set things up, have the form send its data to the LiveCycle server, and use that data to populate the form. The fill can then be stored in the server's database, or routed into the PDF form (or any other form) and streamed back to the client.
Create the form using LiveCycle Designer.
The quick-and-dirty option would be the following: set the form to submit via HTTP POST (for example as XFDF; see the Acrobat documentation for more info) to your ASP.NET server, and publish the form on the server. Make sure your users don't download the form before opening it, otherwise this won't work: the form has to be opened in the web browser. Then simply capture the submissions as you would capture an HTTP POST from a web page. Optionally, save the fill to a database. Then send the captured XFDF stream back to the client (this could also be invoked at a later stage via an HTTP link). The XFDF stream will contain the URL of the form used to fill it out. The client's web browser will ask the Acrobat/Adobe Reader plug-in to handle the XFDF stream, and the plug-in will locate, download and populate the form pointed to by the XFDF.
The user should now be able to save the form AND its fill - no Reader Extension needed!
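To make the XFDF round trip a little more concrete, here is a rough sketch of what such a stream looks like when built by hand with System.Xml.Linq. The XFDF namespace and the f/fields/field/value element names follow the XFDF format; the XfdfBuilder class, the form URL and the field name are placeholders for illustration only. In the scenario above you would normally just echo back the XFDF that the form posted.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class XfdfBuilder
{
    static readonly XNamespace Ns = "http://ns.adobe.com/xfdf/";

    // Builds an XFDF document that points at formUrl (the <f href="..."/> element)
    // and carries one <field name="..."><value>...</value></field> per entry.
    public static XDocument Build(string formUrl, IDictionary<string, string> fields)
    {
        var fieldsElem = new XElement(Ns + "fields");
        foreach (KeyValuePair<string, string> pair in fields)
        {
            fieldsElem.Add(new XElement(Ns + "field",
                new XAttribute("name", pair.Key),
                new XElement(Ns + "value", pair.Value)));
        }
        return new XDocument(
            new XDeclaration("1.0", "UTF-8", null),
            new XElement(Ns + "xfdf",
                new XElement(Ns + "f", new XAttribute("href", formUrl)),
                fieldsElem));
    }
}
```

When streaming this back from an ASP.NET handler, the response content type would be application/vnd.adobe.xfdf; whether the browser then hands it to the Adobe plug-in depends on the client setup.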

You can also use iTextSharp to fill xml data into a Reader Extension enabled form. There are two things you need to set correctly:
Set PdfReader.unethicalreading = true to prevent BadPasswordException.
Set append mode in PdfStamper's constructor, otherwise the Adobe Reader Extensions signature is broken and Adobe Reader will display the following message: "This document contained certain rights to enable special features in Adobe Reader. The document has been changed since it was created and these rights are no longer valid. Please contact the author for the original version of this document."
So all you need to do is this:
PdfReader.unethicalreading = true;
using (var pdfReader = new PdfReader("form.pdf"))
{
    using (var outputStream = new FileStream("filled.pdf", FileMode.Create, FileAccess.Write))
    {
        using (var stamper = new iTextSharp.text.pdf.PdfStamper(pdfReader, outputStream, '\0', true))
        {
            stamper.AcroFields.Xfa.FillXfaForm("data.xml");
        }
    }
}
See How to fill XFA form using iText?

Related

Display Word and excel files in web page from database in asp.net c#

I have a database where I have saved documents such as PDF, Word (docx) and Excel files. I want to display them on a web page when a view button is clicked. I am able to display a PDF file using the approach below.
string embed = "<object data=\"{0}{1}\" type=\"application/pdf\" width=\"500px\" height=\"600px\">";
ltEmbed.Text = string.Format(embed, ResolveUrl("~/Handler1.ashx?id=" + id + "&Name=" + Name), temp); // literal control
In Handler1.ashx I have the following:
byte[] bytes;
string constr = ConfigurationManager.ConnectionStrings["myConnectionString"].ConnectionString;
using (SqlConnection con = new SqlConnection(constr))
{
    using (SqlCommand cmd = new SqlCommand())
    {
        cmd.CommandText = "select pdfdoc from repository";
        cmd.Connection = con;
        con.Open();
        using (SqlDataReader sdr = cmd.ExecuteReader())
        {
            sdr.Read();
            // note: the column read here must be one that the query above selects
            bytes = (byte[])sdr["BPM_Doc"];
        }
    }
    con.Close();
}
context.Response.Buffer = true;
context.Response.Charset = "";
context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
context.Response.ContentType = "application/pdf";
context.Response.BinaryWrite(bytes);
context.Response.End();
How can I use the same approach to display Word and Excel files? My Word document has images too.
You cannot directly view Word documents and Excel spreadsheets within your ASP.NET web application. You first have to render the documents into a form (e.g. HTML or images) that can be displayed on your web page. You can have a look at the following article, which shows how to create a universal document viewer application in ASP.NET MVC using the GroupDocs.Viewer for .NET API.
Document Viewer in ASP.NET Core MVC for 140+ File Formats
Disclosure: I work as a developer evangelist at GroupDocs.
The short answer is you can't, at least not the way you're going about it, even if you have the Office suite installed on your own machine.
Most browsers handle PDF documents with their own built-in PDF reader, which is what I suspect your browser is doing. Microsoft Office documents, however, use a multitude of MIME types, ranging from application/msword for the older Word files to application/vnd.openxmlformats-officedocument.wordprocessingml.document for the newer generations of Office, so built-in browser support for even displaying them is going to be either technically unviable or restricted by license.
Check out this document viewer to build the reader/viewer into your project directly:
ASP.NET Document Viewer from GroupDocs
Note: This is not free by any means, it's actually quite expensive and I wish anyone luck trying to find one that will be cheap.
Also, don't be content with simply providing your user with the file itself unless you're confident they have the means to open it.
In my line of work, the end users we deal with don't have Microsoft Office installed on their machines by design (licensing really), so providing information through PDF documents is the only way to go.
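If offering the file for download is acceptable after all, the handler from the question mainly needs the right content type per document. Here is a minimal sketch, assuming the file name (or at least its extension) is stored alongside the BLOB; OfficeMimeTypes is a hypothetical helper, not an ASP.NET API. Note that this only makes the browser download the file or hand it to an installed application; it does not render Word or Excel inline.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OfficeMimeTypes
{
    // Registered MIME types for the formats mentioned in the question.
    static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".pdf",  "application/pdf" },
            { ".doc",  "application/msword" },
            { ".docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
            { ".xls",  "application/vnd.ms-excel" },
            { ".xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" }
        };

    // Looks up the content type by file extension; unknown extensions
    // fall back to a generic binary type.
    public static string ForFileName(string fileName)
    {
        string ext = Path.GetExtension(fileName) ?? string.Empty;
        string type;
        return Map.TryGetValue(ext, out type) ? type : "application/octet-stream";
    }
}
```

In the handler this would replace the hard-coded "application/pdf", for example context.Response.ContentType = OfficeMimeTypes.ForFileName(name), usually together with a content-disposition header for non-PDF files.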

Generating Excel Documents with ASP.NET Website

I have an ASP.NET application that helps the user create a GridView with certain data in it. Once this table is generated, I want the user to push a button and be able to save the table as an Excel document. There are two different methods I know of:
Using HtmlTextWriter with ContentType "application/vnd.ms-excel" to send the file as an HttpResponse. I use GridView1.RenderControl(htmlTextWriter) to render the gridview. This almost works, but the excel file always shows a warning when the file opens because the content doesn't match the extension. I have tried various content types to no avail. This makes sense I guess, because I'm using an HtmlWriter. It also doesn't seem a good practice.
The second thing I've tried is generating the Excel file using Office Automation. But for the file to be generated, I need to save it to disk and then read it again. From what I have read, this is the only way, because the Excel object only becomes a real Excel file once you save it. I found that the .saveas method from the Excel class would throw an exception because of write permissions, even if I tried to save in the App_Data folder. So I did some research and found that apparently Office Automation is discouraged for web services: https://support.microsoft.com/en-us/kb/257757
Microsoft does not currently recommend, and does not support,
Automation of Microsoft Office applications from any unattended,
non-interactive client application or component (including ASP,
ASP.NET, DCOM, and NT Services), because Office may exhibit unstable
behavior and/or deadlock when Office is run in this environment.
There surely must be a safe way to have a website generate an Excel file and offer it to the user!? I can't imagine that this problem is unsolved or so rare that nobody cares about it, yet I can't find any good solution to it.
The easiest (and best) way to create an Excel file is by using EPPlus.
EPPlus sample for a web application
// Requires: using System.Drawing; using OfficeOpenXml; using OfficeOpenXml.Style;
// tbl is the DataTable holding the data to export
using (ExcelPackage pck = new ExcelPackage())
{
    ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Demo");
    //Load the datatable into the sheet, starting from cell A1. Print the column names on row 1
    ws.Cells["A1"].LoadFromDataTable(tbl, true);
    //Format the header for column 1-3
    using (ExcelRange rng = ws.Cells["A1:C1"])
    {
        rng.Style.Font.Bold = true;
        rng.Style.Fill.PatternType = ExcelFillStyle.Solid; //Set Pattern for the background to Solid
        rng.Style.Fill.BackgroundColor.SetColor(Color.FromArgb(79, 129, 189)); //Set color to dark blue
        rng.Style.Font.Color.SetColor(Color.White);
    }
    //Example how to Format Column 1 as numeric
    using (ExcelRange col = ws.Cells[2, 1, 2 + tbl.Rows.Count, 1])
    {
        col.Style.Numberformat.Format = "#,##0.00";
        col.Style.HorizontalAlignment = ExcelHorizontalAlignment.Right;
    }
    //Write it back to the client
    Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
    Response.AddHeader("content-disposition", "attachment; filename=ExcelDemo.xlsx");
    Response.BinaryWrite(pck.GetAsByteArray());
}

Providing PDF specific parameters in an ASP.NET Generic Handler writing a PDF BLOB stream

EDIT: modified Title to be more specific
I've created a generic handler in VS2012 using their basic template as a starting point and modified it to grab a pdf from our sqlserver. The primary code block is this:
buffer = DirectCast(rsp.ScalarValue, Byte())
context.Response.ContentType = "application/pdf"
context.Response.OutputStream.Write(buffer, 0, buffer.Length)
context.Response.Flush()
And this works fine to display the BLOB as a pdf using whichever pdf plugin is installed on any given browser.
My Question: How can I modify the handler to write Adobe PDF specific parameters to the output? Specifically I'm trying to set width='fit' such that the output PDF stream will autofit the document to the width of the popup window.
NB: Writing the BLOB to a pdf file and serving the PDF is not an option.
Thanks in advance for any advice or links
I don't think there's anything that you can do in your handler. According to that document PDF viewers can examine the URL that was used to open the PDF but there are no HTTP headers that you can set. So you'll need to modify the thing that links to your handler to have those parameters in place. Alternatively, you could build a pre-handler that HTTP redirects to your new handler with those parameters in place.
Also, that document was written in 2007 and was intended for Adobe Acrobat and Adobe Reader. Most modern browsers ship with their own internal PDF viewer these days so unless you are only targeting Adobe your efforts might be wasted.
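As a rough sketch of "having those parameters in place": Adobe's open parameters are passed in the URL fragment, and view=FitH is the one that fits the page to the window width. The helper below only builds such a URL; PdfOpenParameters and WithFitWidth are names made up here, and the handler URL is a placeholder. Whether a given viewer honours the fragment is up to that viewer.

```csharp
using System;

public static class PdfOpenParameters
{
    // Appends the "fit width" open parameter as a URL fragment,
    // replacing any fragment that is already present.
    public static string WithFitWidth(string handlerUrl)
    {
        int hash = handlerUrl.IndexOf('#');
        string baseUrl = hash >= 0 ? handlerUrl.Substring(0, hash) : handlerUrl;
        return baseUrl + "#view=FitH";
    }
}
```

The page that opens the popup would link to PdfOpenParameters.WithFitWidth("Handler.ashx?id=5") instead of the bare handler URL. A pre-handler could pass the same string to context.Response.Redirect, although whether the fragment survives the redirect depends on the browser.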

Can I test the validity of an image file before uploading it in ASP.NET?

I have an ASP.NET web application that allows the user to upload a file from his PC to a SQL Server database (which is later used to generate an image for an <img> tag). Is there an "easy" way to test the image within .NET to validate that it does not contain anything malicious before saving it?
Right now, I use this:
using (MemoryStream F = new MemoryStream())
using (Bitmap TestBitmap = new Bitmap(Filename)) // throws if Filename is not an image
{
    // re-encode as PNG so that only the decoded image data is kept
    TestBitmap.Save(F, System.Drawing.Imaging.ImageFormat.Png);
    Photo = F.ToArray();
}
int PhotoSize = Photo.Length;
Creating TestBitmap fails if it is not an image (e.g. if Filename is the name of a text file), but apparently this doesn't stop a file that is an image with malicious code appended to it from loading as an image, so saving it as a MemoryStream and then writing the stream to a byte array (which is later saved in the database) supposedly fixes this.
To prevent people from passing programs and other data through the photo upload feature of your site, you can take two main steps:
Read the image and save it again with your own code, to remove anything else.
Limit the size of each image to a reasonable number.
To prevent someone from uploading malicious code and running it on your server, keep uploads in an isolated folder without permission to execute anything. More information about that:
I've been hacked. Evil aspx file uploaded called AspxSpy. They're still trying. Help me trap them‼
And a general topic on the same subject: Preparing an ASP.Net website for penetration testing

SSRS get meta data of remote report

How can I retrieve the meta data such as Description, Modified/Create Dates etc from a Remote SSRS report. The report itself displays no problems in the ReportViewer control on the aspx page so I can access the report...
there doesn't seem to be any properties for those values in the .ServerReport object...
thanks heaps!
There are a couple of ways. One is to add a web reference to the web services interface of your reporting server and call the GetReportDefinition method. More information here:
http://msdn.microsoft.com/en-us/library/aa258101(SQL.80).aspx
The code could look like this:
ReportingService reportingService = new ReportingService();
XmlDocument xmlDocument = null;
byte[] reportDefinition = reportingService.GetReportDefinition(ReportName);
using (MemoryStream memoryStream = new MemoryStream(reportDefinition))
{
    xmlDocument = new XmlDocument();
    xmlDocument.Load(memoryStream);
}
This gets your .rdl file that you can parse using the XML tools. You can also call the tables in the SSRS database via SQL/ADO/Linq to get the information you are after:
Some good examples of T-SQL against the reporting service database:
http://www.purplefrogsystems.com/blog/?p=13
All of the information you are after might not be in a single spot, for example, some might be in the .rdl, and some in the SQL Server database.
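As an illustration of the "parse the .rdl" route, here is a small sketch that pulls the Description element out of the definition bytes returned by GetReportDefinition. It matches on the local element name because the RDL namespace differs between SSRS versions. RdlMetadata is a hypothetical helper, and it assumes the report author actually filled in a description in the report definition. Created/modified dates are not part of the .rdl, so those would still have to come from the report server itself.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class RdlMetadata
{
    // Returns the text of the top-level <Description> element of an RDL
    // document, or null when the report definition has none.
    public static string ReadDescription(byte[] reportDefinition)
    {
        using (var stream = new MemoryStream(reportDefinition))
        {
            XDocument rdl = XDocument.Load(stream);
            XElement description = rdl.Root.Elements()
                .FirstOrDefault(e => e.Name.LocalName == "Description");
            return description == null ? null : description.Value;
        }
    }
}
```

Usage would be along the lines of RdlMetadata.ReadDescription(reportingService.GetReportDefinition(ReportName)).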

Resources