Downloading images of a specific format using bing-image-downloader

Is there a way to download only images of a specific format, such as jpg or png, using bing-image-downloader in Python?
This is the documented usage of bing-image-downloader:
from bing_image_downloader import downloader
downloader.download(query_string, limit=100, output_dir='dataset', adult_filter_off=True, force_replace=False, timeout=60, verbose=True)
Is there something that can be added to this call to, let's say, download only .jpg images from the web?
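One possible workaround, sketched below, is to let the download finish and then remove every file whose extension is not wanted. This is only a post-processing sketch, not a feature of bing-image-downloader: the helper name keep_only_extensions is hypothetical, and the assumption that the images land in a folder named after the query inside output_dir should be checked against your installed version.

```python
from pathlib import Path


def keep_only_extensions(folder, allowed=(".jpg", ".jpeg")):
    """Delete every file in `folder` whose suffix is not in `allowed`.

    Hypothetical helper for illustration; returns the paths that were kept.
    The comparison is case-insensitive, so "photo.JPG" is kept as well.
    """
    allowed = {ext.lower() for ext in allowed}
    kept = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # leave sub-directories untouched
        if path.suffix.lower() in allowed:
            kept.append(path)
        else:
            path.unlink()  # remove files of any other format
    return kept


# Assumed usage after the download call has finished:
# kept = keep_only_extensions(Path("dataset") / query_string)
```

Note that this filters by file name only. Because files are removed after the fact, fewer than `limit` images may remain.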

Related

Why can't I view or open downloaded PNG file from URL

I have downloaded an image (in fact several images, using a for loop) with the code below. However, the images do not open, even though they seem to have downloaded completely. They also fail to open in ordinary tools such as Photo editor or Paint. I would appreciate any input on what should be done.
Below is the code that I tried with the for loop:
p <- c("http://assets.pokemon.com/assets/cms2/img/pokedex/full/001.png",
"http://assets.pokemon.com/assets/cms2/img/pokedex/full/002.png",
"http://assets.pokemon.com/assets/cms2/img/pokedex/full/003.png",
"http://assets.pokemon.com/assets/cms2/img/pokedex/full/003_f2.png",
"http://assets.pokemon.com/assets/cms2/img/pokedex/full/004.png")
p
for (url in p)
download.file(url, destfile=file.path("C:/Users/xyz/Desktop/test",basename(url)))
library(imager)
# loading only first image for viewing
i <- load.image("C:/Users/xyz/Desktop/test/001.png")
plot(i)
Then I just downloaded a single file giving a simple destination name and tried to load and display using the below code.
download.file("http://assets.pokemon.com/assets/cms2/img/pokedex/full/001.png",
destfile=file.path("C:/Users/xyz/Desktop/test","first_img.png"))
i_s <- load.image("C:/Users/xyz/Desktop/test/first_img.png")
plot(i_s)
In both the cases I am getting the below error message.
Error in read.bitmap(file) :
File f: C:/Users/xyz/Desktop/test/001.png does not appear to be a PNG, BMP, JPEG, or TIFF
Similarly, if I try to open the downloaded images using Photos, Photos editor, Adobe, Paint, etc., I get messages such as "format not supported" or "unable to load photo". However, if I simply copy and paste the image URL into the browser, the image displays perfectly in the web page.
I would appreciate input on what can be done here.

Looks like you have to set mode = "wb" in download.file. The manual says:
The choice of binary transfer (‘mode = "wb"’ or ‘"ab"’) is important on Windows, since unlike Unix-alikes it does distinguish between text and binary files and for text transfers changes ‘\n’ line endings to ‘\r\n’ (aka ‘CRLF’).
On Windows, if ‘mode’ is not supplied (‘missing()’) and ‘url’ ends in one of ‘.gz’, ‘.bz2’, ‘.xz’, ‘.tgz’, ‘.zip’, ‘.jar’, ‘.rda’, ‘.rds’ or ‘.RData’, ‘mode = "wb"’ is set so that a binary transfer is done to help unwary users.
So for the single file try:
download.file("http://assets.pokemon.com/assets/cms2/img/pokedex/full/001.png",
destfile=file.path("C:/Users/xyz/Desktop/test","first_img.png"),
mode = "wb")
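To see why a text-mode transfer breaks a PNG, here is a small Python sketch (Python rather than R, purely as an illustration). It assumes the damage comes from the LF to CRLF translation quoted above. The helper names are hypothetical. Every valid PNG starts with a fixed 8-byte signature that itself contains line-ending bytes, so the translation corrupts the very first bytes of the file.

```python
# First 8 bytes of every valid PNG file, as defined by the PNG specification.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data: bytes) -> bool:
    """Return True if `data` begins with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def simulate_text_mode_write(data: bytes) -> bytes:
    """Mimic the LF -> CRLF translation applied by a Windows text-mode write."""
    return data.replace(b"\n", b"\r\n")


original = PNG_SIGNATURE + b"...rest of the file..."
damaged = simulate_text_mode_write(original)

print(looks_like_png(original))  # True
print(looks_like_png(damaged))   # False
```

The same check can be run on the first 8 bytes of a downloaded file to confirm whether it was transferred in binary mode.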

Converting .mbtiles to .png images

We downloaded map.mbtiles from openmaptiles.com.
Now we are trying to convert that map.mbtiles to png images.
We tried mbutil for the conversion, but the images we got are not in a supported format.
We need a method or process to convert it.

The easiest way would be to use Tileserver-GL to render png raster tiles from your mbtiles. Documentation is available here: https://tileserver.readthedocs.io/en/latest/
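Before rendering, it may help to check what the tileset actually contains. An MBTiles file is a SQLite database, and the MBTiles specification stores a format entry (for example pbf, png or jpg) in its metadata table. If the value is pbf, the tiles are vector data rather than images, which could explain why extracted files do not open as pictures. The sketch below assumes the file follows the specification; the function name mbtiles_format is hypothetical.

```python
import sqlite3


def mbtiles_format(path):
    """Return the 'format' value from the MBTiles metadata table, or None."""
    con = sqlite3.connect(path)
    try:
        row = con.execute(
            "SELECT value FROM metadata WHERE name = 'format'"
        ).fetchone()
    finally:
        con.close()
    return row[0] if row else None


# Assumed usage:
# print(mbtiles_format("map.mbtiles"))
```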

Why does R raster::writeRaster() generate a pic which can't be shown in Win10?

I read my hyperspectral (.raw) file and combined three bands into "gai_out_r". Then I wrote the output as follows:
writeRaster(gai_out_r,filepath,format="GTiff")
Finally I got gai_out_r.tif.
But why can't Win10 display this small tif, when it can display the one I exported the same way from ENVI (Save Image As, tif)?
The two tiffs are displayed by Win10 as follows:

The default Windows image viewing applications don't support hyperspectral images. Since you are just reading and combining 3 bands from your .raw file, the resulting image will still be a hyperspectral image. You need separate dedicated software to view hypercubes, or you can view it using spectral-python (SPy).
In SPy, envi.save_image will save it as an ENVI type file only. To save it as an RGB image file (readable in Windows) we need to use other methods.

You are using writeRaster to write to a GTiff (GeoTiff) format file. To write to a standard tif file you can use the tiff method. With writeRaster you could also write to a PNG instead:
writeRaster(gai_out_r, "gai.png")

Cause of the issue:
I had a similar issue and recognised that the exported .tif files had a different bit depth than .tif images I could open. The images could not be displayed using common applications, although they were not broken and I could open them in R or QGIS. Hence, the values were coded in a way Windows would not expect.
When you type ?writeRaster() you will find that there are various options when it comes to saving a .tif (or other format) using the raster::writeRaster() function. Click on the links therein to get to the dataType {raster} help site and you'll find there are various integer types to choose from.
Solution (write a Windows-readable GeoTIFF):
I set the following options to make the resulting .tif file readable (note the datatype option):
writeRaster(raster, filename = "/path/to/your/output.tif",
format = "GTiff", datatype = "INT1U")
Note:
I realised your post is from 2 and a half years ago... Anyways, may this answer help others who encounter this problem.
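If the band values do not already fit into the 8-bit range (0 to 255) that INT1U stores, they may need to be rescaled before writing. Whether that applies to the data in the question is an assumption, as the post does not state the value range. The Python sketch below shows a plain linear stretch on a list of values; the function name stretch_to_uint8 is hypothetical, and the same arithmetic can be applied to a raster in R.

```python
def stretch_to_uint8(values):
    """Linearly rescale numbers so the minimum maps to 0 and the maximum to 255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant band carries no contrast; map everything to 0.
        return [0 for _ in values]
    span = hi - lo
    # Adding 0.5 before truncating rounds to the nearest integer.
    return [int((v - lo) * 255 / span + 0.5) for v in values]


# Assumed 16-bit sensor values, for illustration only.
band = [1200, 3400, 65535]
scaled = stretch_to_uint8(band)
print(min(scaled), max(scaled))  # 0 255
```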

Convert from tiff to other multipage format

I am looking for a way to convert multi-page tif files into pdf or another multi-page format using PL/SQL. What I need to achieve is to display the converted pdf (or other format), with all the pages from the tif file, in the browser.
What I have done so far is convert from tif to png, but since a png is a single image it's not what I'm looking for.
ordsys.ordimage.process(dest_loc, 'fileFormat=PNG');
ordsys.ordimage.getproperties(dest_loc, v_clob);
Chrome and Firefox don't support the TIFF format anymore.
Any other ideas would be great. Thank you!

unable to perform OCR on tiff and jpeg files

I am following https://github.com/keensoft/alfresco-simple-ocr to perform OCR on tiff and jpeg files, but it reports "Couldn't find trailer dictionary", "Couldn't read xref table", and exception Failure("Error: pdfinfo could not determine number of pages. Check the pdf input file.\n"). The transformation from jpeg or tiff files to PDF files works properly, and the PDF file is visible on the Alfresco Share page, but OCR does not work on those tiff and jpeg files.

There are many tools for performing OCR on pdf files, and the details depend on the tool. There is a known bug in Alfresco, which is a library issue. The details are below.
Create a file called transformation.sh and, before adding your command to it, add the line below. If you are using Windows you need to create a batch file accordingly.
unset LD_LIBRARY_PATH
If you do not set this in the script file you will get an error during conversion. You can find the bug details at the link below; it is a registered issue in Alfresco.
https://issues.alfresco.com/jira/browse/ALF-19946
PDF to PDF conversion is explained well at the link below.
http://www.krutikjayswal.com/2016/07/ocr-on-pdf-file-in-alfresco.html
You might need to change the source code for tiff conversion.
