Improve zbarimg qrcode recognition

I had a working system of scanning sheets of paper and then letting zbarimg recognize QR codes on these sheets (the QR code can appear anywhere on the page). Suddenly QR code recognition got much worse and eventually stopped working entirely.
The physical scanner generates PDFs from the scanned sheets of paper. I use ghostscript to convert it to a picture:
gs -sDEVICE=png16m -sCompression=lzw -r600x600 -dNOPAUSE -sOutputFile='scantest.png' scantest.pdf
This is the result: when you try to read the QR code with your smartphone, it is recognized immediately. But when I run zbarimg:
zbarimg scantest.png
Zbarimg doesn't recognize anything:
scanned 0 barcode symbols from 1 images in 6,6 seconds
I tried to apply this solution:
https://stackoverflow.com/a/40609947/4654597
But without any luck; it actually destroyed the QR code entirely.
I also tried applying a light blur filter, as suggested in this post:
Decode QR-Code in scanned PDF
I used ImageMagick for this task:
convert scantest.png -blur 1x1 scantest_after_blur.png
I also tried 1x2, 1x3, 1x4, 1x6, and 1x8, but nothing helped.
How could I get zbarimg to work again?

Here is what finally worked:
convert input.png +repage -threshold 50% -morphology open square:1 output.png
zbarimg output.png
The most important part is probably the morphology step. I got the whole ImageMagick command from this post: QR code detection with ZBar from console fails for valid QR codes (ZBarCam from camera detects them fine)
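If you have many scans to process, the whole pipeline can be scripted. Here is a minimal Python sketch (the scan_qr helper and file names are illustrative; it assumes convert and zbarimg are on your PATH):

import subprocess

def scan_qr(png_in, png_clean="cleaned.png"):
    # Binarize and despeckle: hard 50% threshold, then a morphological
    # open with a 1-pixel square kernel to remove scanner noise.
    subprocess.run(
        ["convert", png_in, "+repage", "-threshold", "50%",
         "-morphology", "open", "square:1", png_clean],
        check=True,
    )
    # zbarimg prints decoded symbols to stdout; exit status 4 just means
    # "no symbol found", so a nonzero status is not treated as fatal here.
    result = subprocess.run(["zbarimg", "--raw", png_clean],
                            capture_output=True, text=True)
    return result.stdout.strip() or None

print(scan_qr("scantest.png"))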

Related

Merging/concatenating video and keeping sound in R using AV package

I am trying to merge/concatenate multiple videos with sound sequentially into one video using only R (I don't want to work with FFmpeg on the command line, as the rest of the project doesn't require it and I would rather not bring it in at this stage).
My code looks like the following:
library(av)

dir <- "C:/Users/Admin/Documents/r_programs/"
videos <- c(
  paste0(dir, "video_1.mp4"),
  paste0(dir, "video_2.mp4"),
  paste0(dir, "video_3.mp4")
)

# encoding
av_encode_video(
  videos,
  output = paste0(dir, "output.mp4"),
  framerate = 30,
  vfilter = "null",
  codec = "libx264rgb",
  audio = videos,
  verbose = TRUE
)
It almost works: the output file is an MP4 containing the 3 videos sequentially, one after the other, but only the audio from the first of the 3 videos is present, and then it cuts off.
It doesn't really matter what the videos are. I have recreated this issue both with the videos I was using and with 3 randomly downloaded 1080p 30 fps videos from YouTube.
Any help is appreciated & thank you in advance.
The behavior you are seeing (only one audio source) is exactly how it is designed to work. In the C source code, you can see that encode_video only takes the first audio entry and ignores the rest. Overall, audio is poorly supported by ropensci/av at the moment, as its primary focus is turning R plots into videos. Perhaps you can file a feature request issue on GitHub.
Meanwhile, why not just use base R's system() function to call FFmpeg from R? Assuming the videos have identical formats, this will also likely speed up your process significantly, because you can use the concat demuxer with stream copy (-c copy), a feature the av library does not support as far as I can tell. (If the formats differ, you need the concat filter instead, which is also explained in the link above.)
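Assuming the clips really do share identical codecs and parameters, the FFmpeg invocation that system() would run looks roughly like this (file names are illustrative). First write a list file, list.txt, containing:
file 'video_1.mp4'
file 'video_2.mp4'
file 'video_3.mp4'
Then concatenate without re-encoding:
ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4
From R, that would be something like system("ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4"). Because -c copy only remuxes the existing streams, every input's audio track is carried over unchanged.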

Thermal printer leaves too much blank space printing QR

I want to print a QR code on my thermal printer. I use the following command:
lpr -P POS58 qr.png
As you can see, my QR code is a PNG image. It prints fine, except that the printer feeds out a lot of blank paper before the QR code.
How could I fix this?
The solution I found was to use the png2escpos library again (linked previously), reducing the image to ~250 px wide so it fits the paper size (I had to try a couple of times before getting it right).
You can save the binary data to a file so you can print it directly later:
./png2escpos my_qr.png > my_qr.bin
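The resizing itself can be done with ImageMagick; ~250 px is simply what happened to fit 58 mm paper, so the exact width may take some experimenting:
convert my_qr.png -resize 250x my_qr_small.png
The saved ESC/POS data can later be sent to the printer unfiltered; with CUPS, the -o raw flag does exactly that:
lpr -P POS58 -o raw my_qr.bin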

Multibyte character reading problem in IronPdf

I am trying out IronPDF. I want to insert PDF metadata, which I read with IronPDF, into a database. However, some "ı" characters in the metadata are not read by IronPDF; spaces are left in their place. Here is my code sample:
var md = PdfDocument.FromFile("___PATH OF PDF FILE___");
var article_title = md.MetaData.Title;
When I copy and paste the string into Notepad++, the "ı" characters come out blank, and the application view shows the same.
Is there a way to solve this problem, or is this a bug in IronPDF? If everything goes well, I am of course thinking of buying it; but if it fails on the first try, I will continue with iTextSharp.
EDIT: First of all, I apologize; Windows caught me by surprise. I struggled all day to get a new system up, and unfortunately Visual Studio etc. are still not installed. I have added one of the files I had problems with below; the IronPDF version appears as 2019.7.0.0.
PDF file: https://yadi.sk/d/HwP9JWRWTzMlSA
First of all, since you hadn't provided us with a sample PDF to work with, I googled some Turkish PDF documents having metadata with Turkish characters. This is the file that I came up with: link
As you can see above, the Author metadata field contains the Turkish character "ı".
Then I created a dotnet fiddle in order to test this file using IronPDF (with the latest available version, since you hadn't specified one):
sample using IronPDF
The output from this sample is ElifCakroglu, which shows the exact same symptom when copied to Notepad++.
Playing with the encodings did not help resolve this issue, so I created another dotnet fiddle to test your alternative solution, iTextSharp: sample using iTextSharp
This time everything worked as it should: ElifCakıroglu
Note: I also tried creating a Word 2016 document, saving it as a PDF, and using that file with the above samples; neither worked (the file was not accepted as a valid PDF) for some reason. After that I tried an online PDF document validator, but the file was fine. Then I used an online converter to change the PDF version with the default settings and used the output PDF with both samples; surprisingly, both of them worked correctly.
My conclusion is that iTextSharp works consistently with both documents having metadata with Turkish characters, while IronPDF works correctly only half the time.
I believe that this issue is resolved and can be tested in the 2020.9 release branch of IronPdf.
https://www.nuget.org/packages/IronPdf/

Convert hex code from TIFF to readable format

I am trying to read in the JPEG table from a TIFF file to locate sub-images in the TIFF file. (This comes from a whole-slide image .svs file, and I am trying to delete the label and macro images.) The JPEG table is hex encoded, and I can't figure out how to turn it into readable information to locate the sub-images.
I have tried unpacking the values. I don't want to save the file and open it in Linux; I want to do this from within a Jupyter notebook. I tried for a while using unpack from the IO core tools, which didn't work. I also briefly tried BeautifulSoup, but it tells me that there is an invalid start byte. Here's the first line I am trying to decode:
b'\xff\xd8\xff\xdb\x00C\x00'
This line should return something like "JPEG image file...". I think if I can translate this line, I can do the rest of the JPEG table.
I ended up using a Python TIFF package to find the pages of the TIFF file I was looking for.
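Those bytes are not encoded text at all; they are raw JPEG markers: FF D8 is the start-of-image (SOI) marker and FF DB opens a quantization-table (DQT) segment, which is why text decoders such as BeautifulSoup report an invalid start byte. Here is a short Python sketch of both routes; the tifffile package and the slide.svs file name are assumptions for illustration:

import struct
import tifffile  # third-party TIFF reader: pip install tifffile

# Interpret the first bytes of the JPEG table by hand.
data = b"\xff\xd8\xff\xdb\x00C\x00"
soi, = struct.unpack(">H", data[0:2])     # 0xFFD8: start of image (SOI)
marker, = struct.unpack(">H", data[2:4])  # 0xFFDB: quantization table (DQT)
length, = struct.unpack(">H", data[4:6])  # segment length, here 0x0043 = 67
print(hex(soi), hex(marker), length)

# Or let a TIFF library enumerate the sub-images (pages) directly.
with tifffile.TiffFile("slide.svs") as tif:
    for page in tif.pages:
        print(page.index, page.shape, page.description[:40])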

Using GhostPCL to convert PCL with images to PDF

I'm currently attempting to convert some PCL files into PDF using GhostPCL (PCL6).
For the most part this works. However, there is an odd problem with some of the conversions: for some reason, PCL6 is not converting some logos that appear at the top of our documents. The logo is of the format:
^[(25XABCDEFGHIJKLMNOPQ^[(3#^M
^[(25X^[&a+1.49RRSTUVWXYZ[\]^_`ab^[(3#^M
^[(25X^[&a+1.49Rcdefghijklmnopqrs^M
when viewing the PCL file in vim. When printing the file as a PCL file, the image prints out correctly, but when converting to PDF, the following takes its place:
ABCDEFGHIJKLMNOPQ
RSTUVWXYZ[\]^_`ab
cdefghijklmnopqrs
I recognize that this data is meant to be matched against some sort of embedded image or font, but it has been really difficult to find useful documentation on PCL (so I can actually figure out what these characters mean) or on the conversion process.
Can anyone offer some insight on how to approach the conversion? We will need these images/logos in the converted documents since they often contain disclaimer information as part of the image.
EDIT1: I've also attempted converting to PostScript and printing that, and the same behavior occurs.
EDIT2: When rendering the PCL file in a viewer, the same text shows up instead of the image. But when printing, the logo does show up. Strange...
EDIT3: To clarify, sending the PCL file directly to a printer does not cause the problem (i.e., the logo prints correctly). It's only when I attempt to convert it to another file format that the problem occurs.
What happens when you try rendering the PCL input with Ghostscript, e.g. to the display device? If it doesn't render, it's not going to end up in a PDF either.
Have you tried printing the file to a PCL printer?
If it works on a PCL printer but not when rendering, you can open a bug against GhostPCL. If it renders but does not end up in the PDF, then you can open a bug against GhostPCL with the 'pdf writer' component.
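For example, rendering the PCL to PNG at printer resolution makes the comparison easy (this assumes the GhostPCL binary is named pcl6; some builds install it as gpcl6):
pcl6 -sDEVICE=png16m -r300 -o page%d.png input.pcl
If the logo is also missing from the PNG output, the problem lies in the PCL interpretation step rather than in the pdfwrite device.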
It's possible that the logo is drawn using a RasterOp, a part of the PCL imaging model which has no counterpart in PDF and so cannot be reproduced. The result of using a RasterOp with the PDF device is variable: sometimes it will do what you expect, but often it will not.
