I need to merge multiple PDF files into one PDF and display it in my web browser.
I know how to display one file:
File file = new File(activite.getLienUploadUn());
FileInputStream inputStream = new FileInputStream(file);
byte[] buffer = new byte[8192];
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int bytesRead;
while ((bytesRead = inputStream.read(buffer)) != -1)
{
baos.write(buffer, 0, bytesRead);
}
response.setHeader("Content-Disposition","inline; filename=\""+file.getName()+"\"");
response.setContentType("application/pdf");
ServletOutputStream outputStream = response.getOutputStream();
baos.writeTo(outputStream);
outputStream.flush();
I think I am going to use PDFBox and its PDFMergerUtility class to merge the files:
PDFMergerUtility mergePdf = new PDFMergerUtility();
mergePdf.addSource(file);
mergePdf.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
But from there, how can I get the merged document into a ByteArrayOutputStream?
Call PDFMergerUtility.setDestinationStream(OutputStream destStream) before mergeDocuments() to pass in an output stream, such as a ByteArrayOutputStream (see the javadoc).
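The pattern behind this answer can be sketched with the JDK alone: any API that writes to a caller-supplied OutputStream can be pointed at a ByteArrayOutputStream, and the bytes read back afterwards. The StreamWriter interface and the capture() helper below are hypothetical names used for illustration; with PDFBox, the merger configured via setDestinationStream() would play the role of the writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class InMemoryDestination {

    /** Hypothetical stand-in for any API that writes its result to a caller-supplied stream. */
    interface StreamWriter {
        void writeTo(OutputStream destination) throws IOException;
    }

    /** Runs the writer against an in-memory stream and returns the bytes it produced. */
    static byte[] capture(StreamWriter writer) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            writer.writeTo(baos);
        } catch (IOException e) {
            // Writing to memory should not fail, but the writer itself may.
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }
}
```

The resulting byte array (or the ByteArrayOutputStream itself, via writeTo) can then be sent to the servlet response exactly as in the single-file code from the question.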
I currently have a PdfReader and a PdfStamper that I am filling out the acrofields with. I now have to copy another pdf to the end of that form I have been filling out and when I do I lose the acrofield on the new form I copy over. Here is the code.
public static void addSectionThirteenPdf(PdfStamper stamper, Rectangle pageSize, int pageIndex){
PdfReader reader = new PdfReader(FacesContext.getCurrentInstance().getExternalContext().getResourceAsStream("/resources/documents/Section13.pdf"));
AcroFields fields = reader.getAcroFields();
fields.renameField("SecurityGuidancePage3", "SecurityGuidancePage" + pageIndex);
stamper.insertPage(pageIndex, pageSize);
stamper.replacePage(reader, 1, pageIndex);
}
The way that I am creating the original document is like this.
OutputStream output = FacesContext.getCurrentInstance().getExternalContext().getResponseOutputStream();
PdfReader pdfTemplate = new PdfReader(FacesContext.getCurrentInstance().getExternalContext().getResourceAsStream("/resources/documents/dd254.pdf"));
PdfStamper stamper = new PdfStamper(pdfTemplate, output);
stamper.setFormFlattening(true);
AcroFields fields = stamper.getAcroFields();
Is there a way to merge using the first piece of code and merge both of the acrofields together?
Depending on what you want exactly, different scenarios are possible, but in any case: you are doing it wrong. You should use either PdfCopy or PdfSmartCopy to merge documents.
The different scenarios are explained in the following video tutorial.
You can find most of the examples in the iText sandbox.
Merging different forms (having different fields)
If you want to merge different forms without flattening them, you should use PdfCopy as is done in the MergeForms example:
public void createPdf(String filename, PdfReader[] readers) throws IOException, DocumentException {
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
copy.setMergeFields();
document.open();
for (PdfReader reader : readers) {
copy.addDocument(reader);
}
document.close();
for (PdfReader reader : readers) {
reader.close();
}
}
In this case, readers is an array of PdfReader instances containing different forms (with different field names), hence we use PdfCopy and we make sure that we don't forget to use the setMergeFields() method, or the fields won't be copied.
Merging identical forms (having identical fields)
In this case, we need to rename the fields, because we probably want different values on different pages. In PDF a field can only have a single value. If you merge identical forms, you have multiple visualizations of the same field, but each visualization will show the same value (because in reality, there is only one field).
Let's take a look at the MergeForms2 example:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
copy.setMergeFields();
document.open();
List<PdfReader> readers = new ArrayList<PdfReader>();
for (int i = 0; i < 3; ) {
PdfReader reader = new PdfReader(renameFields(src, ++i));
readers.add(reader);
copy.addDocument(reader);
}
document.close();
for (PdfReader reader : readers) {
reader.close();
}
}
public byte[] renameFields(String src, int i) throws IOException, DocumentException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, baos);
AcroFields form = stamper.getAcroFields();
Set<String> keys = new HashSet<String>(form.getFields().keySet());
for (String key : keys) {
form.renameField(key, String.format("%s_%d", key, i));
}
stamper.close();
reader.close();
return baos.toByteArray();
}
As you can see, the renameFields() method creates a new document in memory. That document is merged with other documents using PdfSmartCopy. If you used PdfCopy here, your document would be bloated (as we'll soon find out).
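One detail of renameFields() is easy to miss: the keys are copied into a new HashSet before the loop. A plausible reason, assuming renameField() modifies the same map that getFields() returns, is that changing a map while iterating over its live key set can throw ConcurrentModificationException. The sketch below shows the same naming scheme on a plain Map; FieldRenamer and renameAll() are hypothetical names, not part of iText.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class FieldRenamer {

    /** Renames every key to "<key>_<i>", mutating the map in place. */
    static void renameAll(Map<String, String> fields, int i) {
        // Snapshot the keys first: removing and adding entries while iterating
        // over the live keySet() can throw ConcurrentModificationException.
        Set<String> keys = new HashSet<>(fields.keySet());
        for (String key : keys) {
            String value = fields.remove(key);
            fields.put(String.format("%s_%d", key, i), value);
        }
    }
}
```

Calling renameAll(fields, 2) on a map with the keys "name" and "city" leaves it with "name_2" and "city_2", which is what gives each merged copy of the form its own independent fields.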
Merging flattened forms
In the FillFlattenMerge1 example, we fill out the forms using PdfStamper. The result is a PDF file that is kept in memory and that is merged using PdfCopy. While this approach is fine for merging different forms, it is actually an example of how not to do it for identical forms (as explained in the video tutorial).
The FillFlattenMerge2 example shows how to merge identical forms that are filled out and flattened correctly:
public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
document.open();
ByteArrayOutputStream baos;
PdfReader reader;
PdfStamper stamper;
AcroFields fields;
StringTokenizer tokenizer;
BufferedReader br = new BufferedReader(new FileReader(DATA));
// the first line is read and discarded (presumably a header row in the data file)
String line = br.readLine();
while ((line = br.readLine()) != null) {
// create a PDF in memory
baos = new ByteArrayOutputStream();
reader = new PdfReader(SRC);
stamper = new PdfStamper(reader, baos);
fields = stamper.getAcroFields();
tokenizer = new StringTokenizer(line, ";");
fields.setField("name", tokenizer.nextToken());
fields.setField("abbr", tokenizer.nextToken());
fields.setField("capital", tokenizer.nextToken());
fields.setField("city", tokenizer.nextToken());
fields.setField("population", tokenizer.nextToken());
fields.setField("surface", tokenizer.nextToken());
fields.setField("timezone1", tokenizer.nextToken());
fields.setField("timezone2", tokenizer.nextToken());
fields.setField("dst", tokenizer.nextToken());
stamper.setFormFlattening(true);
stamper.close();
reader.close();
// add the PDF to PdfCopy
reader = new PdfReader(baos.toByteArray());
copy.addDocument(reader);
reader.close();
}
br.close();
document.close();
}
These are three scenarios. Your question is too unclear for anyone but you to decide which scenario is the best fit for your needs. I suggest that you take the time to learn before you code. Watch the video, try the examples, and if you still have doubts, you'll be able to post a smarter question.
I'd like to load an image directly from a URL but without saving it on the server, I want to upload it directly from memory to Amazon S3 server.
This is my code:
Dim wc As New WebClient
Dim fileStream As IO.Stream = wc.OpenRead("http://www.domain.com/image.jpg")
Dim request As New PutObjectRequest()
request.BucketName = "mybucket"
request.Key = "file.jpg"
request.InputStream = fileStream
client.PutObject(request)
The Amazon API gives me the error "Could not determine content length". The stream fileStream ends up as "System.Net.ConnectStream" which I'm not sure if it's correct.
The exact same code works with files from the HttpPostedFile but I need to use it in this way now.
Any ideas how I can convert the stream to become what Amazon API is expecting (with the length intact)?
I had the same problem when using GetObjectResponse and its ResponseStream property to copy a file from one folder to another in the same bucket. I noticed that the AWS SDK (2.3.45) has some faults, such as another GetObjectResponse method, WriteResponseStreamToFile, that simply doesn't work. These gaps need workarounds.
I solved the problem by reading the file into a byte array and wrapping it in a MemoryStream object.
Try this (C# code):
WebClient wc = new WebClient();
Stream fileStream = wc.OpenRead("http://www.domain.com/image.jpg");
byte[] fileBytes = fileStream.ToArrayBytes();
PutObjectRequest request = new PutObjectRequest();
request.BucketName = "mybucket";
request.Key = "file.jpg";
request.InputStream = new MemoryStream(fileBytes);
client.PutObject(request);
The extension method:
public static byte[] ToArrayBytes(this Stream input)
{
byte[] buffer = new byte[16 * 1024];
using (MemoryStream ms = new MemoryStream())
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
return ms.ToArray();
}
}
You can also create the MemoryStream directly, without an intermediate byte array. But after the first PutObject to S3, the MemoryStream will be discarded. If you need to put other objects, I recommend the first option.
WebClient wc = new WebClient();
Stream fileStream = wc.OpenRead("http://www.domain.com/image.jpg");
MemoryStream fileMemoryStream = fileStream.ToMemoryStream();
PutObjectRequest request = new PutObjectRequest();
request.BucketName = "mybucket";
request.Key = "file.jpg";
request.InputStream = fileMemoryStream ;
client.PutObject(request);
The extension method:
public static MemoryStream ToMemoryStream(this Stream input)
{
byte[] buffer = new byte[16 * 1024];
int read;
MemoryStream ms = new MemoryStream();
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
return ms;
}
I had the same problem in a similar scenario.
The reason for the error is that to upload an object the SDK needs to know the whole content length that is going to be uploaded. To be able to obtain stream length it must be seekable, but the stream returned from WebClient is not. To indicate the expected length set Headers.ContentLength in PutObjectRequest. The SDK will use this value if it cannot determine length from the stream object.
To make your code work, obtain the content length from the response headers returned by the call made by WebClient, then set PutObjectRequest.Headers.ContentLength. Of course, this relies on the server returning a Content-Length value.
Dim wc As New WebClient
Dim fileStream As IO.Stream = wc.OpenRead("http://www.example.com/image.jpg")
Dim contentLength As Long = Long.Parse(wc.ResponseHeaders("Content-Length"))
Dim request As New PutObjectRequest()
request.BucketName = "mybucket"
request.Key = "file.jpg"
request.InputStream = fileStream
request.Headers.ContentLength = contentLength
client.PutObject(request)
I came up with a solution that uses UploadPart when the length is not available by any other means, plus this does not load the entire file into memory.
if (args.DocumentContents.CanSeek)
{
PutObjectRequest r = new PutObjectRequest();
r.InputStream = args.DocumentContents;
r.BucketName = s3Id.BucketName;
r.Key = s3Id.ObjectKey;
foreach (var item in args.CustomData)
{
r.Metadata[item.Key] = item.Value;
}
await S3Client.PutObjectAsync(r);
}
else
{
// if stream does not allow seeking, S3 client will throw error:
// Amazon.S3.AmazonS3Exception : Could not determine content length
// as a work around, if cannot use length property, will chunk
// file into sections and use UploadPart, so do not have to load
// entire file into memory as a single MemoryStream.
var r = new InitiateMultipartUploadRequest();
r.BucketName = s3Id.BucketName;
r.Key = s3Id.ObjectKey;
foreach (var item in args.CustomData)
{
r.Metadata[item.Key] = item.Value;
}
var multipartResponse = await S3Client.InitiateMultipartUploadAsync(r);
try
{
var completeRequest = new CompleteMultipartUploadRequest
{
UploadId = multipartResponse.UploadId,
BucketName = s3Id.BucketName,
Key = s3Id.ObjectKey,
};
// S3 requires every part except the last to be at least 5 MiB, so use that as the block size;
// any larger size (even a configured value) would work as well
const int blockSize = 5 * 1024 * 1024;
// BinaryReader gives us access to ReadBytes
using (var reader = new BinaryReader(args.DocumentContents))
{
var partCounter = 1;
while (true)
{
byte[] buffer = reader.ReadBytes(blockSize);
if (buffer.Length == 0)
break;
using (MemoryStream uploadChunk = new MemoryStream(buffer))
{
uploadChunk.Position = 0;
var uploadRequest = new UploadPartRequest
{
BucketName = s3Id.BucketName,
Key = s3Id.ObjectKey,
UploadId = multipartResponse.UploadId,
PartNumber = partCounter,
InputStream = uploadChunk,
};
// could call UploadPart on multiple threads, instead of using await, but that would
// cause more data to be loaded into memory, which might be too much
var part2Task = await S3Client.UploadPartAsync(uploadRequest);
completeRequest.AddPartETags(part2Task);
}
partCounter++;
}
var completeResponse = await S3Client.CompleteMultipartUploadAsync(completeRequest);
}
}
catch
{
await S3Client.AbortMultipartUploadAsync(s3Id.BucketName, s3Id.ObjectKey
, multipartResponse.UploadId);
throw;
}
}
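The chunking loop at the heart of this answer does not depend on S3. As a rough Java sketch of the same idea (assuming Java 11 or later for InputStream.readNBytes; StreamChunker and forEachPart() are hypothetical names), the stream is read in fixed-size blocks, and each block is handed to a callback with a part number starting at 1, so only one block is held in memory at a time:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.BiConsumer;

final class StreamChunker {

    /**
     * Reads the stream to its end in blocks of at most blockSize bytes and passes
     * each block, with its 1-based part number, to the handler.
     * Returns the number of parts that were produced.
     */
    static int forEachPart(InputStream in, int blockSize, BiConsumer<Integer, byte[]> handler) {
        int partNumber = 0;
        try {
            while (true) {
                // readNBytes blocks until blockSize bytes are read or the stream ends,
                // so only the final part can be shorter than blockSize.
                byte[] part = in.readNBytes(blockSize);
                if (part.length == 0) {
                    break; // end of stream
                }
                partNumber++;
                handler.accept(partNumber, part);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return partNumber;
    }
}
```

In an upload scenario, the handler would be the place to send one part and record its ETag; the stream's total length is never needed.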
For a web application built on struts and jsp technologies, I'm looking for a good example which explains how to download files from the server side.
I managed to do it with these few lines of code.
Just add this to your action:
OutputStream out = response.getOutputStream();
response.setContentType("application/rtf");
FileInputStream in = new FileInputStream("your_file_path");
byte[] buffer = new byte[4096];
int length;
while ((length = in.read(buffer)) > 0){
out.write(buffer, 0, length);
}
in.close();
out.flush();
I have the PDF file location and the PDF file in my POJO class. I want to download the PDF using a servlet. Please tell me some ways to get it done.
File Location=/tmp/SWBC_444Thu May 03 20:01:07 IST 20124366242221752147545.pdf
Using this file location, I want to prompt the user to download the file as a PDF.
Here is my code:
File file = new File(filePath);
OutputStream responseOutputStream = response.getOutputStream();
response.setContentLength((int)filePath.length());
FileInputStream fileInputStream = new FileInputStream(file);
int size = fileInputStream.available();
byte[] content = new byte[size];
int bytesRead;
while ((bytesRead = fileInputStream.read(content)) != -1)
{
responseOutputStream.write(content, 0, bytesRead);
}
responseOutputStream.flush();
fileInputStream.close();
responseOutputStream.close();
I can read and generate the file, but when I open it, it is empty.
Thanking you..!
httpservletresponse.setHeader("Content-disposition", "attachment; filename=\"" + title + ".pdf\""); should do
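A second thing to check in the question's code, assuming filePath is a String: response.setContentLength((int)filePath.length()) declares the number of characters in the path, not the size of the file, so the client may stop reading after a few dozen bytes and show a broken or empty PDF. A minimal sketch of getting the real size and streaming the file, using only the JDK (FileStreamer and its method names are hypothetical):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileStreamer {

    /** Size of the file's contents in bytes (not the length of its path string). */
    static long contentLength(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies the whole file to the given stream and returns the number of bytes written. */
    static long send(Path file, OutputStream out) {
        try {
            return Files.copy(file, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a servlet, the value from contentLength() would be passed to the response before writing, and the response's output stream would be the second argument to send().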
I have created a custom pipeline component which transforms a complex excel spreadsheet to XML. The transformation works fine and I can write out the data to check. However when I assign this data to the BodyPart.Data part of the inMsg or a new message I always get a routing failure. When I look at the message in the admin console it appears that the body contains binary data (I presume the original excel) rather than the XML I have assigned - see screen shot below. I have followed numerous tutorials and many different ways of doing this but always get the same result.
My current code is:
public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(Microsoft.BizTalk.Component.Interop.IPipelineContext pc, Microsoft.BizTalk.Message.Interop.IBaseMessage inmsg)
{
//make sure we have something
if (inmsg == null || inmsg.BodyPart == null || inmsg.BodyPart.Data == null)
{
throw new ArgumentNullException("inmsg");
}
IBaseMessagePart bodyPart = inmsg.BodyPart;
//create a temporary directory
const string tempDir = @"C:\test\excel";
if (!Directory.Exists(tempDir))
{
Directory.CreateDirectory(tempDir);
}
//get the input filename
string inputFileName = Convert.ToString(inmsg.Context.Read("ReceivedFileName", "http://schemas.microsoft.com/BizTalk/2003/file-properties"));
swTemp.WriteLine("inputFileName: " + inputFileName);
//set path to write excel file
string excelPath = tempDir + @"\" + Path.GetFileName(inputFileName);
swTemp.WriteLine("excelPath: " + excelPath);
//write the excel file to a temporary folder
bodyPart = inmsg.BodyPart;
Stream inboundStream = bodyPart.GetOriginalDataStream();
Stream outFile = File.Create(excelPath);
inboundStream.CopyTo(outFile);
outFile.Close();
//process excel file to return XML
var spreadsheet = new SpreadSheet();
string strXmlOut = spreadsheet.ProcessWorkbook(excelPath);
//now build an XML doc to hold this data
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml(strXmlOut);
XmlDocument finalMsg = new XmlDocument();
XmlElement xEle;
xEle = finalMsg.CreateElement("ns0", "BizTalk_Test_Amey_Pipeline.textXML",
"http://tempuri.org/INT018_Workbook.xsd");
finalMsg.AppendChild(xEle);
finalMsg.FirstChild.InnerXml = xDoc.FirstChild.InnerXml;
//write xml to memory stream
swTemp.WriteLine("Write xml to memory stream");
MemoryStream streamXmlOut = new MemoryStream();
finalMsg.Save(streamXmlOut);
streamXmlOut.Position = 0;
inmsg.BodyPart.Data = streamXmlOut;
pc.ResourceTracker.AddResource(streamXmlOut);
return inmsg;
}
Here is a sample of writing the message back:
IBaseMessage Microsoft.BizTalk.Component.Interop.IComponent.Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
IBaseMessagePart bodyPart = pInMsg.BodyPart;
if (bodyPart != null)
{
using (Stream originalStrm = bodyPart.GetOriginalDataStream())
{
byte[] changedMessage = ConvertToBytes(ret);
using (Stream strm = new AsciiStream(originalStrm, changedMessage, resManager))
{
// Setup the custom stream to put it back in the message.
bodyPart.Data = strm;
pContext.ResourceTracker.AddResource(strm);
}
}
}
return pInMsg;
}
The AsciiStream used a method like this to read the stream:
override public int Read(byte[] buffer, int offset, int count)
{
int ret = 0;
int bytesRead = 0;
byte[] FixedData = this.changedBytes;
if (FixedData != null)
{
bytesRead = count > (FixedData.Length - overallOffset) ? FixedData.Length - overallOffset : count;
Array.Copy(FixedData, overallOffset, buffer, offset, bytesRead);
if (FixedData.Length == (bytesRead + overallOffset))
this.changedBytes = null;
// Increment the overall offset.
overallOffset += bytesRead;
offset += bytesRead;
count -= bytesRead;
ret += bytesRead;
}
return ret;
}
I would first of all add more logging to your component around the MemoryStream logic - maybe write the file out to the file system so you can make sure the Xml version is correct. You can also attach to the BizTalk process and step through the code for the component which makes debugging a lot easier.
I would try switching the use of MemoryStream to a more basic custom stream that writes the bytes for you. In the BizTalk SDK samples for pipeline components there are some examples for a custom stream. You would have to customize the stream sample so it just writes the stream. I can work on posting an example. So do the additional diagnostics above first.
Thanks,