I'm saving an image from a website.
The file appears on disk and is 1.2 MB,
but I can't open it.
download.file(
'http://www.sothebys.com/content/dam/stb/lots/N09/N09781/101N09781_994Y9.jpg',
method='wb',
destfile='~/i.jpg')
When I try readJPEG, I get "JPEG decompression error: Bogus marker length"
'wb' is not a method listed in the official documentation; it is a valid value for the mode argument instead. Try the 'auto' method:
download.file('http://www.sothebys.com/content/dam/stb/lots/N09/N09781/101N09781_994Y9.jpg',method='auto', destfile='~/i.jpg')
As @Rohit pointed out, method = 'auto' works fine.
library(jpeg)
download.file('http://www.sothebys.com/content/dam/stb/lots/N09/N09781/101N09781_994Y9.jpg',
method='auto', destfile='i.jpg')
x <- readJPEG("i.jpg")
This will return x as a large array.
Related
I have a problem downloading data over HTTPS in R. I tried using curl, but it doesn't work.
URL <- "https://github.com/Bitakhparsa/Capstone/blob/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
options('download.file.method'='curl')
download.file(URL, destfile = "./data.csv", method="auto")
I downloaded the CSV file with that code, but when I checked the data the format had changed, so it didn't download correctly.
Could someone please help me?
I think you might actually have the URL wrong. I think you want:
https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv
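Then you can download the file by passing that URL straight to download.file() rather than creating a variable for it. The "libcurl" method is built into base R, so loading RCurl is not strictly required for this call.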
library(RCurl)
download.file("https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv",destfile="./data.csv",method="libcurl")
You can also load the file directly into R from the site, again using the raw URL:
URL <- "https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
out <- read.csv(textConnection(getURL(URL)))  # getURL() comes from RCurl and fetches the page contents
You can use the 'raw.githubusercontent.com' link, i.e. in the browser, when you go to "https://github.com/Bitakhparsa/Capstone/blob/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv" you can click on the link "View raw" (it's above "Sorry about that, but we can’t show files that are this big right now.") and this takes you to the actual data. You also have some minor typos.
This worked as expected for me:
url <- "https://raw.githubusercontent.com/Bitakhparsa/Capstone/0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
download.file(url, destfile = "./data.csv", method="auto")
df <- read.csv("./data.csv")  # read from the same path used as destfile above
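The change from the "blob" page URL to the "raw" URL described above is mechanical, so it can be scripted. The idea is language-agnostic; here is a rough sketch in Python. The function name github_blob_to_raw is made up for this example, and it assumes the URL has the shape github.com/OWNER/REPO/blob/REF/PATH, as the one in the question does.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url):
    """Turn a github.com '/blob/' page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: owner / repo / "blob" / ref / file path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a github.com blob URL: " + url)
    owner, repo, _blob, *rest = segments
    raw_path = "/" + "/".join([owner, repo] + rest)
    return urlunsplit((parts.scheme, "raw.githubusercontent.com", raw_path, "", ""))


page_url = (
    "https://github.com/Bitakhparsa/Capstone/blob/"
    "0850c8f65f74c58e45f6cdb2fc6d966e4c160a78/Plant_1_Generation_Data.csv"
)
print(github_blob_to_raw(page_url))
```

For the URL from the question this prints the same raw.githubusercontent.com address that the answers above recommend.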
I'm trying to download all the comics(png) from xkcd.com and here's my code to do the download job:
imageFile = open(os.path.join('XKCD', os.path.basename(comicLink)), 'wb')
for chunk in res.iter_content(100000):
    imageFile.write(chunk)
imageFile.close()
The downloaded file is 6388 bytes and cannot be opened, while the real file from the link is 27.6 KB.
I've already tested my code line by line in shell, so I'm pretty sure I get the right link and right file.
I just don't understand why the png downloaded by my code is smaller.
I also searched for why this is happening, but found nothing helpful.
Thanks.
Okay, since you are using requests, here is a function that will let you download a file given a URL:
import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter
    r = requests.get(url, stream=True)
    r.raise_for_status()  # fail early on an HTTP error instead of saving an error page
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
    return local_filename
Link to the documentation -> http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
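A file that is much smaller than expected and that no viewer will open is also what you would see if the server sent back an error page instead of the image. One quick way to check is to look at the first bytes of the saved file. The snippet below is only a sketch: the helper name sniff_image_type is made up for this example, and it only knows the standard PNG and JPEG signatures.

```python
import os
import tempfile

# Leading "magic" bytes of the two image formats discussed here.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def sniff_image_type(path):
    """Return 'png', 'jpeg', or None, judging only by the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


# Demo on two temporary files: one starts like a PNG, one is an HTML error page.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.png")
    bad = os.path.join(tmp, "bad.png")
    with open(good, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    with open(bad, "wb") as f:
        f.write(b"<html>403 Forbidden</html>")
    print(sniff_image_type(good))  # png
    print(sniff_image_type(bad))   # None
```

If the function returns None for a file you just downloaded, open the file in a text editor; it may well contain an HTML error message rather than image data.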
Example link: http://www.espncricinfo.com/inline/content/image/252437.html?alt=1
I tried the download.file() and getURL() methods, but they didn't work.
One option:
library(httr)
img <- httr::GET("http://www.espncricinfo.com/inline/content/image/252437.html?alt=1")
writeBin(img$content, "localfile.jpg")
You can extract the redirected filename with img$url. In this case, you can also find mirrors for it under img$all_headers, though that isn't necessarily always the case. Try str(img) to see more of the response type that GET returns.
My goal is to download an image from a URL and then display it in R.
I have a URL and figured out how to download the file, but the downloaded file can't be previewed because it is 'damaged, corrupted, or is too big'.
y = "http://upload.wikimedia.org/wikipedia/commons/5/5d/AaronEckhart10TIFF.jpg"
download.file(y, 'y.jpg')
I also tried
image('y.jpg')
in R, but I get this error message:
Error in image.default("y.jpg") : argument must be matrix-like
Any suggestions?
If I try your code, it looks like the image is downloaded. However, when I open it with the Windows image viewer, it also says the file is corrupt.
The reason is that you haven't specified the mode in the download.file statement, so the file is written in text mode rather than binary mode.
Try this:
download.file(y,'y.jpg', mode = 'wb')
For more info about mode, see ?download.file
This way, at least the file that you downloaded opens correctly.
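To see why the mode matters, here is a small illustration, written in Python because the effect is not specific to R. It assumes the damage comes from Windows text-mode newline translation (every LF byte written as CR LF), which is the usual explanation for this symptom. The function names and the 10-byte payload are made up for the example; the payload only imitates the start and end markers of a JPEG.

```python
import os
import tempfile


def write_like_windows_text_mode(data, path):
    """Simulate a Windows text-mode write: every LF byte is written as CR LF."""
    with open(path, "w", encoding="latin-1", newline="\r\n") as f:
        f.write(data.decode("latin-1"))


def write_binary(data, path):
    """Write the bytes exactly as they are."""
    with open(path, "wb") as f:
        f.write(data)


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


# Ten bytes that imitate binary image data and happen to contain two LF (0x0a) bytes.
payload = b"\xff\xd8\xff\xe0\x00\x0a\x10\x0a\xff\xd9"

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text_mode.jpg")
    bin_path = os.path.join(tmp, "binary_mode.jpg")
    write_like_windows_text_mode(payload, text_path)
    write_binary(payload, bin_path)
    text_bytes = read_bytes(text_path)
    bin_bytes = read_bytes(bin_path)

print(len(payload), len(bin_bytes), len(text_bytes))  # 10 10 12
print(bin_bytes == payload)   # True
print(text_bytes == payload)  # False
```

The text-mode copy gains one extra byte per LF, which is enough to shift every marker after it and make an image decoder reject the file.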
To view the image in R, have a look at
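library(jpeg)  # provides readJPEG()
jj <- readJPEG("y.jpg", native = TRUE)
plot(0:1, 0:1, type = "n", ann = FALSE, axes = FALSE)  # empty canvas without axes
rasterImage(jj, 0, 0, 1, 1)  # draw the image over the whole plot area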
or how to read.jpeg in R 2.15
or Displaying images in R in version 3.1.0
This could work too:
library("jpeg")
library("png")
library("RCurl")  # provides getURLContent()
x <- "http://upload.wikimedia.org/wikipedia/commons/5/5d/AaronEckhart10TIFF.jpg"
image_name <- readJPEG(getURLContent(x))  # for a JPEG URL
# image_name <- readPNG(getURLContent(x))  # use this line instead for a PNG URL
After downloading the image, you can use base R to open the file using your default image viewer program like this:
file.show(yourfilename)
I'm trying to adopt the Reproducible Research paradigm, but I also want to meet people who prefer Excel to text data files halfway. My plan is to host Excel files on Dropbox and then access them using the xlsx package.
Rather like downloading and unpacking a zipped file I assumed something like the following would work:
# Prerequisites
require("xlsx")
require("ggplot2")
require("repmis")
require("devtools")
require("RCurl")
# Downloading data from Dropbox location
link <- paste0(
"https://www.dropbox.com/s/",
"{THE SHA-1 KEY}",
"{THE FILE NAME}"
)
url <- getURL(link)
temp <- tempfile()
download.file(url, temp)
However, I get Error in download.file(url, temp) : unsupported URL scheme
Is there an alternative to download.file that will accept this URL scheme?
Thanks,
Jon
You have the wrong URL - the one you are using just goes to the landing page. I think the actual download URL is different, I managed to get it sort of working using the below.
I actually don't think you need to use RCurl or the getURL() function, and I think you were leaving out some relatively important /'s in your previous formulation.
Try the following:
link <- paste("https://dl.dropboxusercontent.com/s",
"{THE SHA-1 KEY}",
"{THE FILE NAME}",
sep="/")
download.file(url=link,destfile="your.destination.xlsx")
closeAllConnections()
UPDATE:
I just realised there is a source_XlsxData function in the repmis package, which in theory should do the job perfectly.
Also the function below works some of the time but not others, and appears to get stuck at the GET line. So, a better solution would be very welcome.
I decided to try taking a step back and figure out how to download a raw file from a secure (https) url. I adapted (butchered?) the source_url function in devtools to produce the following:
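download_file_url <- function(url, outfile, ..., sha1 = NULL) {
  # Only httr is needed: GET(), stop_for_status() and content(). sha1 is currently unused.
  require(httr)
  stopifnot(is.character(url), length(url) == 1)
  # Fetch first, so no empty file is left behind if the request fails
  request <- GET(url)
  stop_for_status(request)
  # Open the destination in binary mode and make sure it is closed on exit
  filetag <- file(outfile, "wb")
  on.exit(close(filetag))
  writeBin(content(request, as = "raw"), filetag)
}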
This seems to work for producing local versions of binary files, Excel included. Nicer, neater, smarter improvements on this would be gratefully received.