How to download and display an image from a URL in R?

My goal is to download an image from a URL and then display it in R.
I have a URL and figured out how to download the file, but the downloaded file can't be previewed because it is 'damaged, corrupted, or is too big'.
y = "http://upload.wikimedia.org/wikipedia/commons/5/5d/AaronEckhart10TIFF.jpg"
download.file(y, 'y.jpg')
I also tried
image('y.jpg')
in R, but I get this error message:
Error in image.default("y.jpg") : argument must be matrix-like
Any suggestions?

If I try your code, the image appears to download. However, when I open it with the Windows image viewer, it is also reported as corrupt.
The reason is that you have not specified the mode in the download.file call.
Try this:
download.file(y,'y.jpg', mode = 'wb')
For more information about the mode, see ?download.file.
This way, at least the file that you downloaded works.
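To illustrate why the mode matters: according to ?download.file, a text-mode transfer on Windows changes \n line endings to \r\n, which alters any binary file that happens to contain the byte 0x0A. The sketch below only imitates that translation on a raw vector. simulate_text_mode is a hypothetical helper name, not part of R and not how download.file is implemented, and the sample bytes are made up.

```r
# Hypothetical helper: imitate a Windows text-mode write, where every
# line feed (0x0A) is expanded to carriage return + line feed (0x0D 0x0A).
simulate_text_mode <- function(bytes) {
  pieces <- lapply(bytes, function(b) {
    if (b == as.raw(0x0A)) as.raw(c(0x0D, 0x0A)) else b
  })
  unlist(pieces)
}

# Made-up sample: a JPEG start marker followed by data containing 0x0A.
original  <- as.raw(c(0xFF, 0xD8, 0xFF, 0x0A, 0x10, 0x0A))
corrupted <- simulate_text_mode(original)

length(original)   # 6
length(corrupted)  # 8: two extra bytes were injected
```

With mode = 'wb' no such translation happens, so the bytes on disk match the bytes that were sent.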
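To view the image in R, have a look at
library(jpeg)  # provides readJPEG()
jj <- readJPEG("y.jpg",native=TRUE)
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)  # empty canvas without axes
rasterImage(jj,0,0,1,1)  # draw the image over the unit square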
or how to read.jpeg in R 2.15
or Displaying images in R in version 3.1.0
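The rasterImage() call above stretches the picture over a unit square, which distorts images that are not square. A minimal sketch that keeps the aspect ratio, assuming the image is a matrix or array with height and width as its first two dimensions, as readJPEG() returns. show_image is a hypothetical helper name, and the random array only stands in for a real photo.

```r
# Hypothetical helper: draw an image array with its own aspect ratio.
show_image <- function(img) {
  h <- dim(img)[1]  # rows = height in pixels
  w <- dim(img)[2]  # columns = width in pixels
  old <- par(mar = c(0, 0, 0, 0))  # no margins around the picture
  on.exit(par(old))                # restore the previous margins
  plot(c(0, w), c(0, h), type = "n", asp = 1, ann = FALSE, axes = FALSE)
  rasterImage(img, 0, 0, w, h)
  invisible(c(width = w, height = h))
}

# Stand-in for readJPEG("y.jpg"): 20 x 30 pixels, 3 colour channels in [0, 1].
demo <- array(runif(20 * 30 * 3), dim = c(20, 30, 3))
show_image(demo)
```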

This could work too:
library("jpeg")
library("png")
library("RCurl")  # provides getURLContent()
x <- "http://upload.wikimedia.org/wikipedia/commons/5/5d/AaronEckhart10TIFF.jpg"
image_name <- readJPEG(getURLContent(x))  # for jpg
# image_name <- readPNG(getURLContent(x))  # for png, use this line instead

After downloading the image, you can use base R to open the file using your default image viewer program like this:
file.show(yourfilename)

Related

Wordcloud showing image but not words in R

Having trouble getting the words to show up on an image mask for a word cloud in R.
Using this Simpsons PNG (https://imgbin.com/png/PV5MuKbG/lisa-simpson-bart-simpson-homer-simpson-maggie-simpson-mayor-quimby-png)
Code is below:
wc1 <- sort(table(bplot_one$word), decreasing = TRUE)
figPath <- "Simpsons.png"
wordcloud2(wc1, figPath = figPath)
It executed fine, but all I get is the png without the words
Any idea how to fix this?
Thanks
This is listed as an open issue on the package site: Fig Mask and lettercloud are not working with package installed from github #68. There is a workaround posted: mask and letterCloud silently fail #12.
The workaround is to refresh the viewer or open in a browser.

Download data from HTTPS in R

I have a problem downloading data over HTTPS in R. I tried using curl, but it doesn't work.
URL <- "https://github.com/Bitakhparsa/Capstone/blob/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
options('download.file.method'='curl')
download.file(URL, destfile = "./data.csv", method="auto")
I downloaded the CSV file with that code, but when I checked the data, the format had changed, so it didn't download correctly.
Could someone please help me?
I think you might actually have the URL wrong. I think you want:
https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv
Then you can download the file directly, passing the URL as a string rather than creating a variable for it. download.file() is part of base R, so method="libcurl" works without loading RCurl:
download.file("https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv",destfile="./data.csv",method="libcurl")
You can also load the file directly into R from the site, since read.csv() accepts a URL:
URL <- "https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
out <- read.csv(URL)
You can use the 'raw.githubusercontent.com' link: in the browser, when you go to "https://github.com/Bitakhparsa/Capstone/blob/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv", you can click the "View raw" link (it's above "Sorry about that, but we can’t show files that are this big right now."), which takes you to the actual data. You also have some minor typos.
This worked as expected for me:
url <- "https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
download.file(url, destfile = "./data.csv", method="auto")
df <- read.csv("./data.csv")  # read the file from where download.file() saved it
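The github.com/.../blob/... address serves an HTML page that wraps the file, which is why the downloaded "CSV" looked wrong. The rewrite from a blob URL to a raw URL described in the answers can be sketched as follows. github_raw_url is a hypothetical helper name, and the pattern assumes the usual https://github.com/user/repo/blob/ref/path layout.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into the raw file URL.
github_raw_url <- function(blob_url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      blob_url)
}

blob <- "https://github.com/Bitakhparsa/Capstone/blob/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
raw_url <- github_raw_url(blob)
raw_url

# df <- read.csv(raw_url)  # needs network access, so it is left commented out
```

URLs that do not match the pattern are returned unchanged.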

How to scrape a downloaded PDF file with R

I’ve recently gotten into scraping (and programming in general) for my internship, and I came across PDF scraping. Every time I try to read a scanned pdf with R, I can never get it to work. I’ve tried using the file.choose() function to no avail. Do I need to change my directory, or how can I get the pdf from my files into R?
The code looks something like this:
> library(pdftools)
> text=pdf_text("C:/Users/myname/Documents/renewalscan.pdf")
> text
[1] ""
Also, using pdftables leads me here:
> library(pdftables)
> convert_pdf("C:/Users/myname/Documents/renewalscan.pdf","my.csv")
Error in get_content(input_file, format, api_key) :
Bad Request (HTTP 400).
You should use the packages pdftools and pdftables.
If you are trying to read text inside the PDF, use the pdf_text() function. Its argument is the path (on your computer or on the web) to the PDF. For example:
tt = pdf_text("C:/Users/Smith/Documents/my_file.pdf")
It would be nice if you were more specific and also gave us a reproducible example.
To use the PDFTables R package, you need to run the following command:
convert_pdf('test/index.pdf', output_file = NULL, format = "xlsx-single", message = TRUE, api_key = "insert_API_key")
If you are looking to get tabular data, you might try tabulizer. Here is a full code tutorial: https://www.business-science.io/code-tools/2019/09/23/tabulizer-pdf-scraping.html
Basically, you can use this code from the tutorial:
library(tabulizer)
extract_tables(
  file = "2019-09-23-tabulizer/endangered_species.pdf",
  method = "decide",
  output = "data.frame")
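None of the answers above addresses the empty string in the question. pdf_text() extracts an embedded text layer, and a scanned PDF usually contains only page images, so an empty result is to be expected. One possible approach, assuming a pdftools version that provides pdf_ocr_text() and an installed tesseract package, is to fall back to OCR when no text comes back. needs_ocr and read_pdf_text are hypothetical helper names.

```r
# Hypothetical helper: TRUE when every page came back empty or whitespace-only.
needs_ocr <- function(pages) {
  all(!nzchar(trimws(pages)))
}

# Hypothetical helper: try the text layer first, then fall back to OCR.
# Assumes pdftools (with pdf_ocr_text) and tesseract are installed.
read_pdf_text <- function(path) {
  pages <- pdftools::pdf_text(path)
  if (needs_ocr(pages)) {
    pages <- pdftools::pdf_ocr_text(path)
  }
  pages
}

needs_ocr("")                       # TRUE, like the output in the question
needs_ocr(c("", "Renewal notice"))  # FALSE, at least one page has text
```

OCR output is only as good as the scan, so the result may still need manual checking.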

Unable to open png device in loop

I've been fiddling around with a function in R, where, long story short, I have a for-loop, and at each step, I save a plot using png, then immediately readPNG so that I can extract RGB information. I then make a second plot, then readPNG this so I can compare the RGB of the two images.
The problem is that I keep getting an error message about being unable to start the png() device, or to open the file for writing, after a number of loops (can be as few as a handful of loops, or as many as a few thousand).
Here is really simplified code, but it has the bare essentials, and generates the error message:
testfun<-function(beg,fini)
{
  library(png)
  setwd("D://mydirectory")
  for (i in beg:fini)
  {
    png("test.png",width=277,height=277) #candidate image
    par(mai=c(0,0,0,0))
    plot(1,type="n",ann=FALSE,xlim=c(0,255),ylim=c(0,255),
         xaxt="n",yaxt="n",frame.plot=F)
    polygon(x=c(10,60,60),y=c(10,10,60),col="red")
    graphics.off()
    image<-readPNG("test.png")
    #code where I get rgb values for original
    png("test2.png",width=277,height=277) #candidate image with different params
    par(mai=c(0,0,0,0))
    plot(1,type="n",ann=FALSE,xlim=c(0,255),ylim=c(0,255),
         xaxt="n",yaxt="n",frame.plot=F)
    polygon(x=c(10,60,60),y=c(10,10,60),col="blue")
    graphics.off()
    image<-readPNG("test2.png")
    #code where I get rgb values for second image, and compare
  }
}
And the error message:
Error in png("test.png", width = 277, height = 277) :
unable to start png() device
In addition: Warning messages:
1: In png("test.png", width = 277, height = 277) :
Unable to open file 'test.png' for writing
2: In png("test.png", width = 277, height = 277) : opening device failed
Originally I had graphics.off() as dev.off() but then thought maybe the loop was so fast that turning off one device wasn't fast enough before needing to be open again and it was getting 'confused' somehow. I also tried using Sys.sleep(0.1) after each graphics.off, but that didn't help either. Am I missing something stupid and obvious, or is this just a device bug?
I've had the same problem occur, although not in a loop situation. In my case, it was because I was pointing the .png output to a directory that did not exist.
png('./tweets/graphics/unique words.png', width=12, height=8, units='in', res=300)
Once I created the directory, and referenced it correctly, the error went away and I got my .png image.
I also had this issue while saving plots in a loop. @Dino Fire gave me a hint: my loop-generated file name contained an illegal character.
Ensure that the file name is legal (look for slashes, ampersands, apostrophes etc.)
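Both causes mentioned so far (a missing output directory and an illegal character in the file name) can be ruled out before png() is called. A minimal sketch: safe_png_path is a hypothetical helper, and the set of allowed characters is a conservative assumption rather than an exact list of what each file system accepts.

```r
# Hypothetical helper: make sure the directory exists and the file name
# contains only letters, digits, dots, underscores and hyphens.
safe_png_path <- function(dir, name) {
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }
  clean <- gsub("[^A-Za-z0-9._-]+", "_", name)  # replace runs of other characters
  file.path(dir, paste0(clean, ".png"))
}

out_dir <- file.path(tempdir(), "tweets", "graphics")
path <- safe_png_path(out_dir, "unique words/&'1")
basename(path)  # "unique_words_1.png"
```

The returned path can then be passed to png(), followed by the plotting calls and dev.off().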
For me, the reason readPNG() wasn't working was that I was running it from within an Rmd (RMarkdown) file.
As soon as I ran the code in the R console or a regular script, it worked immediately.
If you are using RStudio (or R), set the working directory to where the pictures (.jpg, .png) are. It should be an actual directory, not just the drive root (C:/).
getwd()
setwd("C:/RCode/Deep Learning/Downloads/")
getwd()

Download png/jpg with R

I would like to download all of the images from this site, but after downloading, all the photos are corrupted. What should I do to download them successfully?
My code:
library(XML)
dir.create('c:/photos')
urls<-paste("http://thedevilsguard.tumblr.com/page/",1:1870,sep="")
doc<-htmlParse(urls[1])
links<-unique(unlist(xpathApply(doc,'//div[@class="timestamp"]/a',xmlGetAttr,'href')))
for (i in 1:length(links)){
  doc2<-htmlParse(links[i])
  link<-xpathApply(doc2,'//div[@class="centre photopage"]//p//img',xmlGetAttr,'src')[[1]][1]
  download.file(link,paste("C:/photos/",basename(link),""))
}
It looks like you are on Windows. When you download binary files, you have to specify the mode to be binary, e.g.
download.file(link, ..., mode = 'wb')
See ?download.file for details.
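Applied to a loop like the one in the question, this could look roughly as follows. download_images is a hypothetical helper. It also builds the destination with file.path(), because paste() with its default separator puts a space between "C:/photos/" and the file name. The demo uses local file:// URLs and temporary stand-in files so that it runs without network access, and it assumes a Unix-style absolute temp path.

```r
# Hypothetical helper: download every link in binary mode into `dir`.
download_images <- function(links, dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  dests <- file.path(dir, basename(links))
  ok <- vapply(seq_along(links), function(i) {
    tryCatch(
      download.file(links[i], dests[i], mode = "wb", quiet = TRUE) == 0,
      error = function(e) FALSE  # a failed download does not stop the loop
    )
  }, logical(1))
  names(ok) <- dests
  ok
}

# Demo with two small stand-in files instead of real photos.
src_dir <- file.path(tempdir(), "src_images")
dir.create(src_dir, showWarnings = FALSE)
src <- file.path(src_dir, c("a.png", "b.png"))
for (f in src) writeBin(as.raw(c(0x89, 0x50, 0x0A, 0x0D, 0x0A)), f)

# Assumes an absolute Unix-style path, so that "file://" + path is a valid URL.
result <- download_images(paste0("file://", src), file.path(tempdir(), "photos"))
result
```

The returned logical vector is named by destination path, so failed downloads are easy to list with names(result)[!result].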
First, try and download one. Do this:
link = "http://29.media.tumblr.com/tumblr_m0q2g8mhGK1qk6uvyo1_500.png"
download.file(link,basename(link))
Does that work?
I notice it's a PNG and NOT a JPEG, so maybe you are trying to read it in as a JPEG.
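The file extension in a URL is only a hint. One way to check what was actually downloaded is to look at the first bytes of the file: PNG files start with the bytes 89 50 4E 47 0D 0A 1A 0A, and JPEG files start with FF D8 FF. detect_image_type is a hypothetical helper, and the demo files contain only those signatures, not real images.

```r
# Hypothetical helper: guess the image type from the file signature.
detect_image_type <- function(path) {
  sig <- readBin(path, what = "raw", n = 8)
  png_sig  <- as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A))
  jpeg_sig <- as.raw(c(0xFF, 0xD8, 0xFF))
  if (length(sig) >= 8 && identical(sig[1:8], png_sig)) return("png")
  if (length(sig) >= 3 && identical(sig[1:3], jpeg_sig)) return("jpeg")
  "unknown"
}

# Stand-in files that hold only a signature each.
png_file  <- tempfile(fileext = ".jpg")  # wrong extension on purpose
jpeg_file <- tempfile(fileext = ".jpg")
writeBin(as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)), png_file)
writeBin(as.raw(c(0xFF, 0xD8, 0xFF, 0xE0)), jpeg_file)

detect_image_type(png_file)   # "png", despite the .jpg extension
detect_image_type(jpeg_file)  # "jpeg"
```

The result can be used to choose between readPNG() and readJPEG().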
