I'm trying to convert ASPX pages to PDF.
The two main issues I've encountered are:
Some of those pages contain GIS elements (mostly Google Maps, but some may be municipal maps). When the user changes their position, I want that state to be converted properly. Right now I can't even convert their default position to PDF.
The text is in Hebrew and I'm having a hard time converting it.
I've tried using jsPDF: I used its addHTML function and looked at its runner example (which uses an iframe, but doesn't seem to work in Internet Explorer or with maps).
Does anyone have any other idea as to how I can convert this? Maybe render the page to a JPEG and then convert that to PDF?
If you have working samples, that would be excellent.
You can try using IECapt (http://iecapt.sourceforge.net/) to capture the page as a JPEG and then convert the JPEG to PDF.
For example:
IECapt --url=http://www.page_you_want_to_render.org/ --out=the_render.jpeg
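The JPEG-to-PDF step isn't covered by IECapt itself; one option (my suggestion, assuming ImageMagick is installed) is its convert tool, which wraps an image into a single-page PDF:
# Wrap the captured JPEG in a one-page PDF (requires ImageMagick):
convert the_render.jpeg the_render.pdf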
We downloaded map.mbtiles from openmaptiles.com.
Now we are trying to convert that map.mbtiles to PNG images.
We tried mbutil to convert it, but the images we got are not supported.
We need a method or process to convert it.
The easiest way would be to use TileServer GL to render PNG raster tiles from your mbtiles. Documentation is available here: https://tileserver.readthedocs.io/en/latest/
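A rough sketch of how that can look with the Docker image (the image name, the --file flag, and the style id in the tile URL are assumptions that vary between versions, so check the docs above):
# Serve map.mbtiles from the current directory on port 8080:
docker run --rm -it -v $(pwd):/data -p 8080:8080 maptiler/tileserver-gl --file map.mbtiles
# Rendered raster tiles are then served as PNGs, e.g.:
curl -o tile.png "http://localhost:8080/styles/basic-preview/0/0/0.png"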
I read my hyperspectral (.raw) file and combined three bands into "gai_out_r". Then I wrote it out as follows:
writeRaster(gai_out_r, filepath, format = "GTiff")
Finally I got gai_out_r.tif.
But why can't Win10 display this small tif the way it displays the tif I exported from ENVI via Save Image As > TIFF?
[Screenshots: the two tiffs as displayed by Win10.]
Default Windows image-viewing applications don't support hyperspectral images. Since you are just reading and combining 3 bands from your .raw file, the resulting image is still a hyperspectral image. You need dedicated software to view hypercubes, or you can view it using Spectral Python (SPy).
In SPy, using envi.save_image will save it as an ENVI-type file only. To save it as an RGB image file (readable in Windows) we need to use other methods.
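One such route outside SPy (my suggestion, assuming GDAL is installed; not something this answer mentions) is gdal_translate, which can rescale the GeoTIFF into an 8-bit PNG that standard viewers open:
# Convert the GeoTIFF to an 8-bit PNG, rescaling values to 0-255:
gdal_translate -of PNG -ot Byte -scale gai_out_r.tif gai_out_r.png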
You are using writeRaster to write a GTiff (GeoTIFF) format file. To write a standard tif file you can use the tiff method. With writeRaster you could also write to a PNG instead:
writeRaster(gai_out_r, "gai.png")
Cause of the issue:
I had a similar issue and recognised that the exported .tif files had a different bit depth than .tif images I could open. The images could not be displayed using common applications, although they were not broken and I could open them in R or QGIS. Hence, the values were coded in a way Windows would not expect.
When you type ?writeRaster() you will find that there are various options when it comes to saving a .tif (or other format) using the raster::writeRaster() function. Click on the links therein to get to the dataType {raster} help site and you'll find there are various integer types to choose from.
Solution (write a Windows-readable GeoTIFF):
I set the following options to make the resulting .tif file readable (note the datatype option):
writeRaster(raster, filename = "/path/to/your/output.tif",
format = "GTiff", datatype = "INT1U")
Note:
I realised your post is from two and a half years ago... Anyway, may this answer help others who encounter this problem.
I have around 10,000 PDF files in which I have to do page and text formatting: add margins to pages, convert all the text to a particular font and size, etc., and then merge the files.
I searched a lot on Google but could not find anything.
Please, can someone help with any option in R or SAS to perform this task?
I'm currently attempting to convert some PCL files into PDF using GhostPCL (PCL6).
For the most part this works. However, there is an odd problem with some of the conversions. For some reason, PCL6 is not converting some logos which are at the top of our documents. The logo is of the format:
^[(25XABCDEFGHIJKLMNOPQ^[(3#^M
^[(25X^[&a+1.49RRSTUVWXYZ[\]^_`ab^[(3#^M
^[(25X^[&a+1.49Rcdefghijklmnopqrs^M
when viewing the PCL file in vim. When printing the file as a PCL file, the image prints out correctly, but when converting to PDF, the following takes its place:
ABCDEFGHIJKLMNOPQ
RSTUVWXYZ[\]^_`ab
cdefghijklmnopqrs
I recognize that the format is meant to be matched against some sort of embedded image or font, but it has been really difficult trying to find useful documentation on PCL (so I can actually figure out what these characters mean) or on the conversion process.
Can anyone offer some insight on how to approach the conversion? We will need these images/logos in the converted documents since they often contain disclaimer information as part of the image.
EDIT1: I've also attempted converting to PostScript and printing that, and the same behavior occurs.
EDIT2: When rendering the PCL file in a viewer, the same text shows up instead of the image. But when printing, the logo does show up. Strange...
EDIT3: To clarify, sending the PCL file to a printer directly does not seem to cause the problem (i.e, the logo does print correctly). It's only when I attempt to convert it to another file format that the problem occurs.
What happens when you try rendering the PCL input with Ghostscript, e.g. to the display device? If it doesn't render, it's not going to end up in a PDF either.
Have you tried printing the file to a PCL printer?
If it works on a PCL printer but not when rendering, you can open a bug against GhostPCL. If it renders but does not end up in the PDF, then you can open a bug against GhostPCL with the 'pdf writer' component.
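Concretely, the two checks might look like this (a sketch; the GhostPCL executable may be called pcl6 or gpcl6 depending on the build, and the screen device is typically display on Windows or x11 on Linux):
# 1. Render the PCL straight to the screen to see whether the logo appears:
pcl6 -sDEVICE=display input.pcl
# 2. Convert to PDF with the pdfwrite device:
pcl6 -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pcl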
It's possible that the logo is drawn using a rasterop; this is a part of the PCL imaging model which has no counterpart in PDF and so cannot be reproduced. The result of using a rasterop with the PDF device is variable: sometimes it will do what you expect, often it will not.
A friend of mine doing an internship asked me 2 hours ago if I could help him avoid manually converting 462 PDF files to .xls using free online software.
I thought of a shell script using unoconv, but I couldn't figure out how to use it properly, and I am not sure unoconv can solve this problem, since it mainly converts files to PDF, not the reverse.
Conversion from PDF to any other structured format is not always possible and not generally recommended.
Having said that, this does look like a one-off job and there's a fair few of them (462).
It's worth pursuing if you can reliably extract text from most of them and it's reasonably structured. It's a matter of trying to get regular text output across a sample of the PDFs that you can reliably parse into a table structure.
There are plenty of tools around that target either direct or OCR-based text extraction; just Google around.
One I like is pstotext from the ghostscript suite; the -bboxes option lets me get the coordinates of each word and leaves it up to me to re-assemble the structure. Despite its name it does work on input PDFs. The downside is that it can be a bit flaky: it works on some PDFs but not others.
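A typical invocation might look like this (a sketch based on the -bboxes option described above; I'm assuming the word/coordinate lines go to standard output):
# Extract each word with its bounding-box coordinates:
pstotext -bboxes input.pdf > words.txt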
If you get that far, you'd then most likely need to write a shell script or program to convert the result to a CSV. You can either open that directly in a spreadsheet or look for tools to convert it to XLS.
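For the last CSV-to-XLS hop, one option (my addition; requires Gnumeric's ssconvert, which infers the output format from the file extension):
# Batch-convert every extracted CSV into .xls:
for f in *.csv; do ssconvert "$f" "${f%.csv}.xls"; done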
PS: If he hasn't already, get the intern to ask if there's any possible way of getting at the original data that was used to create the PDFs. It will save a lot of time and effort and lead to a far more accurate result.
Update: An alternative to pstotext is the renderpdf.pl command, which is included in the Perl CAM::PDF module. It's more robust, but it only reports each text item's (x,y) position, not bounding boxes.
Other responses on a linked question suggest Tabula, too.
https://github.com/tabulapdf/tabula
I tried it and it works very well.
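Tabula also ships a command-line sibling, tabula-java, which suits a 462-file batch; a sketch (the jar filename here is a placeholder for whichever release you download):
# Extract all tables from every PDF into one CSV per file:
for f in *.pdf; do
  java -jar tabula.jar --pages all --format CSV -o "${f%.pdf}.csv" "$f"
done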