Java 8 ImageIO does not support the WebP image format - javax.imageio

I need to create a thumbnail from a WebP image, but ImageIO doesn't support this format. Is there a library that would let me do something like this?
String format = getImageFormat(imageFile);
Iterator readers = ImageIO.getImageReadersByFormatName(format);
// rescaling the image
BufferedImage bi = loadImageRescalingIfNeeded(imageFile, metadata,...);
//resample if needed
bi = resampleImageIfNeeded(bi, thumbWidth, thumbHeight);
// rotate if degree > 0
bi = rotateBufferedImage(bi, degree);
// create jpeg in output
try (ImageOutputStream imageOut = ImageIO.createImageOutputStream(fileOutputStream)){
try {
ImageWriter writer = ImageIO.getImageWritersBySuffix("jpeg").next();
ImageWriteParam iwp = writer.getDefaultWriteParam();
iwp.setProgressiveMode(ImageWriteParam.MODE_DEFAULT);
writer.setOutput(imageOut);
writer.write(null, new IIOImage(bi, null, metadata), iwp);
} catch (Exception e) {...}
....}
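Before picking a library, it can help to confirm what ImageIO can actually read at runtime. The following is a minimal sketch, not a definitive fix: it only uses the standard javax.imageio API to ask whether any registered plugin handles a given format name. The class and method names (WebpSupportCheck, hasReader) are made up for illustration. On a stock Java 8 runtime the WebP check is expected to report no reader; a third-party ImageIO plugin on the classpath (TwelveMonkeys ImageIO is one project that, as far as I know, offers a WebP reader; please verify before relying on it) would change that without any change to the calling code.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.Iterator;

// Hypothetical helper: reports whether ImageIO has a reader for a format name.
class WebpSupportCheck {

    // Returns true if any registered ImageIO plugin claims to read the format.
    static boolean hasReader(String formatName) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // Re-scan the classpath so plugins added at runtime are registered.
        ImageIO.scanForPlugins();
        System.out.println("webp reader available: " + hasReader("webp"));
        System.out.println("jpeg reader available: " + hasReader("jpeg"));
    }
}
```

If the check reports a reader, the existing ImageIO.getImageReadersByFormatName(format) call in the question should start returning it, and the rest of the thumbnail pipeline can stay as it is.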

Related

Extracting text with iText does not work: encoding problem or encrypted text?

I have a PDF file that has the following security properties: printing: allowed; document assembly: NOT allowed; content copy: allowed; content copy for accessibility: allowed; page extraction: NOT allowed.
I try to extract the text with code based on the documentation sample, as follows:
pdftext.Text = null;
StringBuilder text = new StringBuilder();
PdfReader pdfReader = new PdfReader(filename);
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
text.Append(System.Environment.NewLine);
text.Append("\n Page Number:" + page);
text.Append(System.Environment.NewLine);
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
text.Append(currentText);
progressBar1.Value++;
}
pdftext.Text += text.ToString();
pdfReader.Close();
but the output text consists of lines like "??? ? ???????\n?? ??? ? ".
It seems that the file is encrypted or that there is an encoding problem.
Note the results of the following lines:
var f = pdfReader.IsOpenedWithFullPermissions; -> FALSE
var f1 = pdfReader.IsEncrypted(); - > FALSE
var f2 = pdfReader.ComputeUserPassword(); - > NULL
var f3 = pdfReader.Is128Key(); - > FALSE
var f4 = pdfReader.HasUsageRights();
f, f1, f3 and f4 return FALSE, so it seems that the document is not encrypted,
...so I don't know whether this is an encoding problem or something related to encrypted strings.
Can someone help me?
Thanks in advance.
G.G.
Whenever you have trouble extracting text from a document using standard code, the first thing to do is to try to copy and paste the text from it using Adobe Acrobat Reader. Adobe Reader copy and paste implements text extraction according to the recommendations of the PDF specification, and if this fails, it usually means that the information required for text extraction is either missing from the document or broken (by accident or by design). To extract the text, one then needs either to customize the code for the specific PDF or to resort to OCR.
In case of the document at hand, Adobe Reader copy&paste does result in garbage, too, just like when extracting with iText. Thus, there is something fishy in the document.
Inspecting the document one finds that the fonts contain ToUnicode mappings like this:
/CIDInit /ProcSet
findresource begin 12 dict begin begincmap /CIDSystemInfo<</Registry(Adobe)
/Ordering(Identity)
/Supplement 0
>>
def
/CMapName/F18 def
1 begincodespacerange <0000> <FFFF> endcodespacerange
44 beginbfrange
<20> <20> <0020>
<21> <21> <E0F9>
<22> <22> <E0F1>
<23> <23> <E0FA>
<24> <24> <E0F7>
<25> <25> <E0A3>
<26> <26> <E084>
<27> <27> <E097>
<28> <28> <E098>
<29> <29> <E09A>
<2A> <2A> <E08A>
<2B> <2B> <E099>
<2C> <2C> <E0A5>
<2D> <2D> <E086>
<2E> <2E> <E094>
<2F> <2F> <E0DE>
<30> <30> <E0A6>
<31> <31> <E096>
<32> <32> <E088>
<33> <33> <E082>
<34> <34> <E04C>
<35> <35> <E0A4>
<36> <36> <E0F6>
<37> <37> <E0F2>
<38> <38> <E0D8>
<39> <39> <E0AA>
<3A> <3A> <E06C>
<3B> <3B> <E087>
<3C> <3C> <E095>
<3D> <3D> <E0C4>
<3E> <3E> <E07E>
<3F> <3F> <E055>
<40> <40> <E089>
<41> <41> <E085>
<42> <42> <E083>
<43> <43> <E070>
<44> <44> <E0E6>
<45> <45> <E080>
<46> <46> <E0C8>
<47> <47> <E0F4>
<48> <48> <E062>
<49> <49> <E0F3>
<4A> <4A> <E04E>
<4B> <4B> <E05E>
endbfrange
endcmap CMapName currentdict /CMap defineresource pop end end
I.e., if you are not into this, the fonts claim that all their glyphs (with the exception of the space glyph at 0x20) represent characters U+E0xx from the Unicode private use area. As the name of that area indicates, there is no common meaning of characters with these values.
Thus, text extraction according to the PDF specification will return strings of characters with undefined meaning, with results like those you observed in iText and I saw in Adobe Reader.
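A quick way to recognize this situation in extracted text is to count how many characters fall into the private use area. The following is a small sketch in plain Java (no PDF library involved); the class and method names are invented for illustration, and it only checks the BMP private use range U+E000 to U+F8FF, which covers the U+E0xx values seen here.

```java
// Hypothetical helper: measures how much of a string is private-use characters.
class PrivateUseCheck {

    // True for code points in the BMP Private Use Area (U+E000..U+F8FF).
    static boolean isPrivateUse(int codePoint) {
        return codePoint >= 0xE000 && codePoint <= 0xF8FF;
    }

    // Fraction of non-whitespace code points that are private-use characters.
    static double privateUseRatio(String text) {
        int total = 0;
        int privateUse = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // spaces and line breaks are mapped normally, so skip them
            }
            total++;
            if (isPrivateUse(cp)) {
                privateUse++;
            }
        }
        return total == 0 ? 0.0 : (double) privateUse / total;
    }

    public static void main(String[] args) {
        // Sample built from code points in the ToUnicode map shown above.
        String extracted = "\uE0F9\uE0F1 \uE0FA";
        System.out.println("private-use ratio: " + privateUseRatio(extracted));
    }
}
```

A ratio close to 1 for a page suggests that the fonts map their glyphs to private use code points, so the extracted string itself is intact but carries no standard meaning.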
Sometimes in such a situation one can still enforce proper text extraction by ignoring the ToUnicode map and using either the font Encoding or information inside the embedded font program.
Unfortunately it turns out that here the Encoding effectively contains the same information as does the ToUnicode map, e.g. for the same font as above
/Differences [ 32 /space /uniE0F9 /uniE0F1 /uniE0FA /uniE0F7 /uniE0A3 /uniE084 /uniE097 /uniE098
/uniE09A /uniE08A /uniE099 /uniE0A5 /uniE086 /uniE094 /uniE0DE /uniE0A6 /uniE096
/uniE088 /uniE082 /uniE04C /uniE0A4 /uniE0F6 /uniE0F2 /uniE0D8 /uniE0AA /uniE06C
/uniE087 /uniE095 /uniE0C4 /uniE07E /uniE055 /uniE089 /uniE085 /uniE083 /uniE070
/uniE0E6 /uniE080 /uniE0C8 /uniE0F4 /uniE062 /uniE0F3 /uniE04E /uniE05E ]
and the fonts turn out to be Type 3 fonts, i.e. there is no embedded font program; each glyph is defined as an individual PDF canvas without further character information.
Thus, nothing to gain here either.
Actually these small PDF canvasses contain inlined bitmap graphics of the respective glyph which also is the cause of the poor graphical quality of the document (if you don't see that immediately, simply zoom in a bit and you'll see the ragged outlines of the glyphs).
By the way, such a construct usually means that the producer of the PDF explicitly wants to prevent text extraction.
If you happen to have to extract text from many such documents, you can try and determine a mapping from their U+E0xx characters to actually sensible Unicode characters and apply that mapping to your extracted text.
If all those fonts in all those documents happen to use the same U+E0xx codepoints for the same actual characters, you'll be able to do text extraction from those documents after investing a certain amount of initial work.
Otherwise do try OCR.
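Once such a mapping has been worked out by hand, applying it to the extracted text is simple. Here is a minimal sketch in plain Java; the class name is invented, and the two mapping entries in main are purely hypothetical placeholders, not the real meaning of those code points in your document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: replaces private-use code points using a hand-built table.
class PrivateUseRemapper {

    private final Map<Integer, String> mapping = new HashMap<>();

    // Registers the real text that a private-use code point stands for.
    void put(int privateUseCodePoint, String replacement) {
        mapping.put(privateUseCodePoint, replacement);
    }

    // Replaces every mapped code point; unmapped ones are kept unchanged.
    String remap(String extracted) {
        StringBuilder out = new StringBuilder(extracted.length());
        for (int i = 0; i < extracted.length(); ) {
            int cp = extracted.codePointAt(i);
            i += Character.charCount(cp);
            String replacement = mapping.get(cp);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        PrivateUseRemapper remapper = new PrivateUseRemapper();
        // Placeholder entries: the real values must be determined by inspecting the glyphs.
        remapper.put(0xE0F9, "A");
        remapper.put(0xE0F1, "B");
        System.out.println(remapper.remap("\uE0F9 \uE0F1"));
    }
}
```

Keeping unmapped code points unchanged makes gaps in the table easy to spot in the output. If different fonts reuse the same code point for different characters, one table per font would be needed instead of a single shared one.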
The following code adds pages to a document which map the ToUnicode values to the characters shown:
void AddFontsTo(PdfReader reader, PdfStamper stamper)
{
int documentPages = reader.NumberOfPages;
for (int page = 1; page <= documentPages; page++)
{
// ignore inherited resources for now
PdfDictionary pageResources = reader.GetPageResources(page);
if (pageResources == null)
continue;
PdfDictionary pageFonts = pageResources.GetAsDict(PdfName.FONT);
if (pageFonts == null || pageFonts.Size == 0)
continue;
List<BaseFont> fonts = new List<BaseFont>();
List<string> fontNames = new List<string>();
HashSet<char> chars = new HashSet<char>();
foreach (PdfName key in pageFonts.Keys)
{
PdfIndirectReference fontReference = pageFonts.GetAsIndirectObject(key);
if (fontReference == null)
continue;
DocumentFont font = (DocumentFont) BaseFont.CreateFont((PRIndirectReference)fontReference);
if (font == null)
continue;
PdfObject toUni = PdfReader.GetPdfObjectRelease(font.FontDictionary.Get(PdfName.TOUNICODE));
CMapToUnicode toUnicodeCmap = null;
if (toUni is PRStream)
{
try
{
byte[] touni = PdfReader.GetStreamBytes((PRStream)toUni);
CidLocationFromByte lb = new CidLocationFromByte(touni);
toUnicodeCmap = new CMapToUnicode();
CMapParserEx.ParseCid("", toUnicodeCmap, lb);
}
catch
{
toUnicodeCmap = null;
}
}
if (toUnicodeCmap == null)
continue;
ICollection<int> mapValues = toUnicodeCmap.CreateDirectMapping().Values;
if (mapValues.Count == 0)
continue;
fonts.Add(font);
fontNames.Add(key.ToString());
foreach (int value in mapValues)
chars.Add((char)value);
}
if (fonts.Count == 0 || chars.Count == 0)
continue;
Rectangle size = (fonts.Count > 10) ? PageSize.A4.Rotate() : PageSize.A4;
PdfPTable table = new PdfPTable(fonts.Count + 1);
table.AddCell("Page " + page);
foreach (String name in fontNames)
{
table.AddCell(name);
}
table.HeaderRows = 1;
float[] widths = new float[fonts.Count + 1];
widths[0] = 2;
for (int i = 1; i <= fonts.Count; i++)
widths[i] = 1;
table.SetWidths(widths);
table.WidthPercentage = 100;
List<char> charList = new List<char>(chars);
charList.Sort();
foreach (char character in charList)
{
table.AddCell(((int)character).ToString("X4"));
foreach (BaseFont font in fonts)
{
table.AddCell(new PdfPCell(new Phrase(character.ToString(), new Font(font))));
}
}
stamper.InsertPage(reader.NumberOfPages + 1, size);
ColumnText columnText = new ColumnText(stamper.GetUnderContent(reader.NumberOfPages));
columnText.AddElement(table);
columnText.SetSimpleColumn(size);
while ((ColumnText.NO_MORE_TEXT & columnText.Go(false)) == 0)
{
stamper.InsertPage(reader.NumberOfPages + 1, size);
columnText.Canvas = stamper.GetUnderContent(reader.NumberOfPages);
columnText.SetSimpleColumn(size);
}
}
}
I applied it to your document like this:
string input = @"4700198773.pdf";
string output = @"4700198773-fonts.pdf";
using (PdfReader reader = new PdfReader(input))
using (FileStream stream = new FileStream(output, FileMode.Create, FileAccess.Write))
using (PdfStamper stamper = new PdfStamper(reader, stream))
{
AddFontsTo(reader, stamper);
}
The additional pages look like this:
Now you have to compare the outputs for the different fonts and pages of this document with each other and with those of a representative selection of files. If you find a good enough pattern, you can try this replacement approach.

Image files don't print correctly in Windows Fax Service

I've written some code to send faxes using faxcomlib. It works fine when I run it in my Windows application, but I want to use it in a Windows service, and that is where the problem appears. When the service sends text files, PDFs or Word documents, it works fine, but when it sends any image format such as .jpg or .tiff, I get an "operation failed" error and the service hangs. I have tried the following, but unfortunately none of it solved the problem:
1- I granted all permissions to my service and the related folders.
2- I tried converting the images to the same format and size that the fax printer uses (CCITT T6 - 1740*2400).
3- I tried converting my images to PDF files and then sending those, but that fails too, even though I had already sent PDF files without any errors.
Here is my code for sending the fax:
if (ImageExtensions.Contains(System.IO.Path.GetExtension(tempFileName).ToUpperInvariant().Trim()))
{
string newFileName = System.IO.Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location) + @"\Temp\" + System.IO.Path.GetFileNameWithoutExtension(currentAttachment) + ".tif";
Toranj.Base.Graphic.Imaging.SaveTiff(tempFileName, newFileName);
Bitmap tempImage = new Bitmap(tempFileName);
System.Drawing.Image newImage = Toranj.Base.Graphic.Imaging.Resize(tempImage, 1728, 2200, RotateFlipType.RotateNoneFlipXY, 204, 196);
tempImage.Dispose();
System.IO.File.Delete(tempFileName);
newImage.Save(tempFileName);
newImage.Dispose();
Toranj.Base.Graphic.Imaging.SaveTiff(tempFileName, newFileName);
//PdfDocument doc = new PdfDocument();
//doc.Pages.Add(new PdfPage());
//XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
//XImage img = XImage.FromFile(tempFileName);
//xgr.DrawImage(img, 0, 0);
//doc.Save(newFileName);
//doc.Close();
fileList.Add(newFileName);
System.IO.File.Delete(tempFileName);
}
else
fileList.Add(tempFileName);
}
FAXCOMEXLib.FaxDocument currentDocument = new FAXCOMEXLib.FaxDocument();
string[] attachList = fileList.ToArray();
currentDocument.Bodies = attachList;
ShamsiDate curretnDate = new ShamsiDate(DateTime.Now);
currentDocument.DocumentName = "fax:" + curretnDate.PerSimpleDate();
currentDocument.Priority = FAXCOMEXLib.FAX_PRIORITY_TYPE_ENUM.fptHIGH;
currentDocument.Recipients.Add(currentRow["faxNumber"].ToString(), currentRow["RecipientName"].ToString());
currentDocument.AttachFaxToReceipt = true;
currentDocument.Sender.Title = "xxxxx";
currentDocument.Sender.Name = "";
currentDocument.Sender.City = "xxxxx";
currentDocument.Sender.Company = "xxxxx";
currentDocument.Sender.Country = "xxxxx";
currentDocument.Sender.Email = "xxxxx";
currentDocument.Sender.FaxNumber = "";
currentDocument.Sender.HomePhone = "";
currentDocument.Sender.TSID = "";
currentDocument.Sender.SaveDefaultSender();
object jobsId = new object();
currentDocument.ConnectedSubmit2(sendServer, out jobsId);
string[] jobID = (string[])jobsId;
Can anyone help me?
In the code above, if a file is an image, I first resize it and save it in TIFF format, and I also tried converting it to PDF before sending it by fax, but there is no change!

How to programmatically convert PDF to PNG in VB.NET?

Is there a library or code somewhere that does that?
Some questions suggest external software, e.g. "Convert a PDF to a Transparent PNG with GhostScript".
I need something that is done programmatically. So my site, which is an ASP site, should have a function like
function PNGfromPDF (someFile as String) as PNGSomething
end function
Something like that.
Any open source solution for that?
Try:
PdfDocument inputDocument = PdfReader.Open(fileNames[i], PdfDocumentOpenMode.Import);
// for each page create a new PDF file and save it on the disk
for (int pageCount = 0; pageCount < inputDocument.PageCount; pageCount++)
{
fileNameWithoutExtension = Path.GetFileNameWithoutExtension(fileNames[i]);
fileName = string.Format("{0}\\Documents\\{1}", Session.CentralWorkingDirectory, String.Format("{0} ({1}-{2}).pdf", fileNameWithoutExtension, pageCount + 1, inputDocument.PageCount));
pdfFile = PDFFile.Open(fileName);
pdfFile.SerialNumber = Configurations.PDFVIEW_KEY;
// Get image file name
string imageFileName = string.Format("{0}.png", fileName.Remove(fileName.Length - 4));
// If thumbnail already exists delete it
if (File.Exists(imageFileName))
{
File.Delete(imageFileName);
}
// Convert page to PNG and save it.
//Bitmap pageImage = pdfFile.GetPageImage(0, 32);
Bitmap pageImage = pdfFile.GetPageImage(0, 92);
pageImage.Save(imageFileName, ImageFormat.Png);
// Cleanup resources
pageImage.Dispose();
pdfFile.Dispose();
}
Here I am using the following namespaces:
using PdfSharp.Drawing;
using O2S.Components.PDFRender4NET; // Third-party component; PDFsharp is used together with it
using System.Drawing.Imaging;

Why does Image.GetThumbnailImage work differently on IIS6 and IIS7.5?

Bit of a strange question and I don't know whether anyone will have come across this one before.
We have an ASP.NET page generating physical thumbnail JPEG files on a filesystem and copying full-size images to a different location. So we input one image and we get a complete copy in one location and a small 102*68 image in a different location.
We're currently looking to finally move away from IIS6 on Server 2003 to IIS7.5 on Server 2008R2, except there's one problem.
On the old system (IIS6/Server 2003) the black borders are removed and the image stays at the correct ratio. On the new system (IIS7.5/Server 2008) the thumbnails are rendered exactly as they exist in the JPEG, with black borders, which makes the thumbnail slightly squashed and obviously includes ugly black borders.
Anyone know why this might be happening? I've done a google and can't seem to find out which behaviour is "correct". My gut tells me that the new system is correctly rendering the thumbnail as it exists, but I don't know.
Anyone have any suggestions how to solve the problem?
I think, as suggested, that the difference is in the .NET version, not in IIS.
Just rewrite your code; you'll save a lot of time, and it is a very simple thing to do.
Here is an image handler I wrote a while ago that redraws any image to your settings.
public class image_handler : IHttpHandler
{
public void ProcessRequest(HttpContext context)
{
// set file
string ImageToDraw = context.Request.QueryString["FilePath"];
ImageToDraw = context.Server.MapPath(ImageToDraw);
// Grab images to work on's true width and height
Image ImageFromFile = Image.FromFile(ImageToDraw);
double ImageFromFileWidth = ImageFromFile.Width;
double ImageFromFileHeight = ImageFromFile.Height;
ImageFromFile.Dispose();
// Get required width and work out new dimensions
double NewHeightRequired = 230;
if (context.Request.QueryString["imageHeight"] != null)
NewHeightRequired = Convert.ToDouble(context.Request.QueryString["imageHeight"]);
double DivTotal = (ImageFromFileHeight / NewHeightRequired);
double NewWidthValue = (ImageFromFileWidth / DivTotal);
double NewHeightVale = (ImageFromFileHeight / DivTotal);
NewWidthValue = ImageFromFileWidth / (ImageFromFileWidth / NewWidthValue);
NewHeightVale = ImageFromFileHeight / (ImageFromFileHeight / NewHeightVale);
// Set new width, height
int x = Convert.ToInt16(NewWidthValue);
int y = Convert.ToInt16(NewHeightVale);
Bitmap image = new Bitmap(x, y);
Graphics g = Graphics.FromImage(image);
Image thumbnail = Image.FromFile(ImageToDraw);
// Quality Control
g.InterpolationMode = InterpolationMode.HighQualityBicubic;
g.SmoothingMode = SmoothingMode.HighQuality;
g.PixelOffsetMode = PixelOffsetMode.HighQuality;
g.CompositingQuality = CompositingQuality.HighQuality;
g.DrawImage(thumbnail, 0, 0, x, y);
g.Dispose();
thumbnail.Dispose(); // release the file handle held by Image.FromFile
image.Save(context.Response.OutputStream, ImageFormat.Jpeg);
image.Dispose();
}
public bool IsReusable
{
get
{
return true;
}
}
}
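The handler above scales to a requested height only. For the fixed 102*68 box from the question, the same aspect-ratio arithmetic can be written so that the result fits inside both limits. This is a language-neutral sketch, written in Java here rather than C#; the class and method names are invented for illustration, and it only computes the target size, not the drawing itself.

```java
// Hypothetical helper: computes a thumbnail size that keeps the aspect ratio.
class ThumbnailSize {

    // Returns {width, height} scaled to fit inside maxWidth x maxHeight.
    static int[] fitWithin(int srcWidth, int srcHeight, int maxWidth, int maxHeight) {
        if (srcWidth <= 0 || srcHeight <= 0 || maxWidth <= 0 || maxHeight <= 0) {
            throw new IllegalArgumentException("all dimensions must be positive");
        }
        // Use the smaller of the two scale factors so neither limit is exceeded.
        double scale = Math.min((double) maxWidth / srcWidth, (double) maxHeight / srcHeight);
        int width = Math.max(1, (int) Math.round(srcWidth * scale));
        int height = Math.max(1, (int) Math.round(srcHeight * scale));
        return new int[] { width, height };
    }

    public static void main(String[] args) {
        int[] size = fitWithin(2000, 1000, 102, 68);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```

Computing the size explicitly and drawing the image yourself avoids depending on whatever Image.GetThumbnailImage does on a given platform, which seems to be where the two servers differ.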

image management in asp.net?

In my application I am uploading an image into a database. Before it is stored in the database, I want to do some image management, such as reducing the file size and reducing the height and width of the image. Can you help me? Is there any source code or reference for this?
What codebehind language are you using?
I find 4guysfromrolla to be a good ASP.NET reference, try this article for starters:
https://web.archive.org/web/20211020111640/https://www.4guysfromrolla.com/articles/012203-1.aspx
If you're talking about something like creating a thumbnail image, the Image class will let you scale an existing image up or down. YMMV.
You want to be looking at the System.Drawing name space for manipulating images with ASP.NET. You can load any supported image file type (i.e. jpg, gif, png, etc) using Image.FromFile(), Image.FromStream(), etc. From there you use the Drawing Graphics context to manipulate the image. To give you a flavour here is my resize image function:
// Creates a resized image from the source image provided that retains the same aspect ratio as the source image.
// - If either the width or height dimensions is not provided then the resized image will use the
// proportion of the provided dimension to calculate the missing one.
// - If both the width and height are provided then the resized image will have the dimensions provided
// with the sides of the excess portions clipped from the center of the image.
public static Image ResizeImage(Image sourceImage, int? newWidth, int? newHeight)
{
bool doNotScale = newWidth == null || newHeight == null;
if (newWidth == null)
{
newWidth = (int)(sourceImage.Width * ((float)newHeight / sourceImage.Height));
}
else if (newHeight == null)
{
newHeight = (int)(sourceImage.Height * ((float)newWidth) / sourceImage.Width);
}
var targetImage = new Bitmap(newWidth.Value, newHeight.Value);
Rectangle srcRect;
var desRect = new Rectangle(0, 0, newWidth.Value, newHeight.Value);
if (doNotScale)
{
srcRect = new Rectangle(0, 0, sourceImage.Width, sourceImage.Height);
}
else
{
if (sourceImage.Height > sourceImage.Width)
{
// clip the height
int delta = sourceImage.Height - sourceImage.Width;
srcRect = new Rectangle(0, delta / 2, sourceImage.Width, sourceImage.Width);
}
else
{
// clip the width
int delta = sourceImage.Width - sourceImage.Height;
srcRect = new Rectangle(delta / 2, 0, sourceImage.Height, sourceImage.Height);
}
}
using (var g = Graphics.FromImage(targetImage))
{
g.SmoothingMode = SmoothingMode.HighQuality;
g.InterpolationMode = InterpolationMode.HighQualityBicubic;
g.DrawImage(sourceImage, desRect, srcRect, GraphicsUnit.Pixel);
}
return targetImage;
}
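One thing to be aware of: when both dimensions are given, the function above always clips the source to a square, which only matches the target when the target is square too. A possible variation, sketched here in Java rather than C# and with invented names, computes a centered crop rectangle that matches an arbitrary target aspect ratio. It is a swapped-in technique for illustration, not the original author's method.

```java
import java.awt.Rectangle;

// Hypothetical helper: centered crop of the source to the target aspect ratio.
class CenterCrop {

    // Returns the largest centered source rectangle with the target's aspect ratio.
    static Rectangle cropToAspect(int srcWidth, int srcHeight, int targetWidth, int targetHeight) {
        // Compare srcWidth/srcHeight with targetWidth/targetHeight by cross-multiplying,
        // which avoids floating-point rounding.
        long lhs = (long) srcWidth * targetHeight;
        long rhs = (long) targetWidth * srcHeight;
        if (lhs > rhs) {
            // Source is relatively wider than the target: clip the width.
            int cropWidth = (int) (rhs / targetHeight);
            return new Rectangle((srcWidth - cropWidth) / 2, 0, cropWidth, srcHeight);
        }
        // Source is relatively taller (or equal): clip the height.
        int cropHeight = (int) (lhs / targetWidth);
        return new Rectangle(0, (srcHeight - cropHeight) / 2, srcWidth, cropHeight);
    }

    public static void main(String[] args) {
        System.out.println(cropToAspect(400, 200, 100, 100));
    }
}
```

The returned rectangle plays the role of srcRect in the function above: drawing it into a destination rectangle of the target size scales the image without distorting it.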
You can either use the image class or a 3rd party DLL such as ASPJPEG. A few ASPJPEG samples can be found here. I do a lot of image processing and my host reliablesite supports this dll on their servers.
