How to import accented public TSV Google Spreadsheet data into R

If I try to import a public spreadsheet like this example into R using:
library(httr)
url <- "https://docs.google.com/spreadsheets/d/1qIOv7MlpQAuBBgzV9SeP3gu0jCyKkKZapPrZHD7DUyQ/pub?gid=0&single=true&output=tsv"
GET(url)
I get garbled accented words in the result.
How can I get the right encoding?
I know I can use googlesheets package, but for public data I prefer to work with direct download, so I don't have to handle user login authentication and token refresh.

I don't know why httr::GET() doesn't work, but this does:
data <- utils::read.csv(url, header=TRUE, sep="\t", stringsAsFactors=FALSE)
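If you do want to stay with httr, one likely culprit is the charset guessed when the response body is decoded; forcing UTF-8 in content() is a sketch worth trying (the URL is the one from the question):

```r
library(httr)

url <- paste0("https://docs.google.com/spreadsheets/d/",
              "1qIOv7MlpQAuBBgzV9SeP3gu0jCyKkKZapPrZHD7DUyQ",
              "/pub?gid=0&single=true&output=tsv")

resp <- GET(url)
stop_for_status(resp)

# Decode the body as UTF-8 explicitly; mojibake usually means the body
# was decoded with the wrong charset, not that GET() itself failed
txt  <- content(resp, as = "text", encoding = "UTF-8")
data <- read.delim(text = txt, stringsAsFactors = FALSE)
```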

If you have a *nix operating system you could use
curl -o data.tsv 'https://docs.google.com/spreadsheets/d/1qIOv7MlpQAuBBgzV9SeP3gu0jCyKkKZapPrZHD7DUyQ//pub?gid=0&single=true&output=tsv'

Related

How to import an Excel file from the browser

I want to use the GET() function from the httr package, because this is just an example file and in the original file I need to supply a user name and password, i.e.:
library(httr)
filename<-"filename_in_url.xls"
URL <- "originalurl"
GET(URL, authenticate("usr", "pwd"), write_disk(paste0("C:/Temp/temp/",filename), overwrite = TRUE))
As a test, I tried to import one of the files from https://www.nordpoolgroup.com/historical-market-data/ without saving it to disk, loading it into the environment instead so I could see the data. However, that also does not work.
library(XML)
library(RCurl)
excel <- readHTMLTable(htmlTreeParse(getURL(paste("https://www.nordpoolgroup.com/4a4c6b/globalassets/marketdata-excel-files/elspot-prices_2021_hourly_eur.xls")), useInternalNodes=TRUE))[[1]]
Or, if there are other ways to import data (functions that accept login information as input), it would be great to see them.
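One sketch for the authenticated case combines write_disk() with a temp file, so nothing permanent lands on disk but the data still ends up in the environment (the URL and credentials here are placeholders, and readxl is an assumed dependency):

```r
library(httr)
library(readxl)  # assumed available; reads both .xls and .xlsx

url <- "https://example.com/protected/data.xls"  # placeholder URL

tmp  <- tempfile(fileext = ".xls")
resp <- GET(url, authenticate("usr", "pwd"),
            write_disk(tmp, overwrite = TRUE))
stop_for_status(resp)

# read_excel() needs a file on disk, so the temp file bridges the gap
# between "download to memory" and "keep a permanent copy"
excel <- read_excel(tmp)
unlink(tmp)  # discard the temp file once the data is loaded
```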

Fail to authenticate BigQuery with R under the bigrquery package

I am trying to use set_service_token in the bigrquery package for a non-interactive authentication.
Here is my code:
library(bigrquery)
set_service_token("client_secret.json")
But it kept showing the error message below:
Error in read_input(file) :
file must be connection, raw vector or file path
However, when I simply read the JSON file, it works:
lapply(fromJSON("client_secret.json"), names)
$`installed`
[1] "client_id" "project_id" "auth_uri" "token_uri" "auth_provider_x509_cert_url" "client_secret" "redirect_uris"
Can anyone help me with this? Thank you very much!
It looks like your JSON file is in the current directory, but you need to supply the full path to the token JSON file. Try this:
json_path <- file.path(getwd(), "client_secret.json")
set_service_token(json_path)
If that doesn't work, you may try it using the environment variables, like this:
Sys.setenv("CLIENT_SECRET_FILE" = json_path)
set_service_token(Sys.getenv('CLIENT_SECRET_FILE'))
Or, try to supply the JSON content, like this:
set_service_token(toJSON(fromJSON("client_secret.json"), pretty = TRUE))
You may also try using gar_auth_service:
library(googleAuthR)
gar_auth_service(
json_file = "client_secret.json" # or better use the full path instead
)
Hope it works.
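If none of those work, note that newer bigrquery releases (1.0 and later) moved authentication to the gargle package; a minimal sketch, assuming the same service-account JSON file:

```r
library(bigrquery)

# In bigrquery >= 1.0, bq_auth() with a service-account key path
# supersedes the older set_service_token() helper
bq_auth(path = normalizePath("client_secret.json"))
```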

Import OData dataset

How do I properly download and load in R an OData dataset?
I tried the OData package; even though the documentation is really simple, I am sure I am missing something trivial.
I am trying to download and parse this dataset in R, but I cannot work out how it is structured. Is it an XML format? If so, what is the reason for a separator argument?
library(OData)
#What is the correct argument for the separator?
downloadResourceCsv("https://data.nasa.gov/OData.svc/gh4g-9sfh", sep = "")
As hrbrmstr suggests, use the RSocrata package:
e.g., go to the dataset page, click on "..." in the top right,
click on "Access this Dataset via OData", click
on "Copy" to copy the OData endpoint, and save it:
url <- "https://data.cdc.gov/api/odata/v4/9bhg-hcku"
library(RSocrata)
dat <- read.socrata(url)
It's XML format, so download it first.
Try using the httr package.
library(httr)
r <- GET("http://httpbin.org/get")
Visit the package's quick-start guide for an introduction.
After downloading, use the XML package's xmlParse() to parse it.
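Putting those two steps together, a sketch for the endpoint from the question (OData v2 services return an Atom/XML feed, so there is no separator to set; the XPath namespace below is the standard Atom one):

```r
library(httr)
library(XML)

url <- "https://data.nasa.gov/OData.svc/gh4g-9sfh"  # endpoint from the question

resp <- GET(url)
stop_for_status(resp)

# Parse the Atom feed and pull out the <entry> nodes that hold the rows
doc     <- xmlParse(content(resp, as = "text", encoding = "UTF-8"))
entries <- getNodeSet(doc, "//atom:entry",
                      namespaces = c(atom = "http://www.w3.org/2005/Atom"))
```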
Thank you

How to download an .xlsx file from a dropbox (https:) location

I'm trying to adopt the Reproducible Research paradigm while still meeting people who prefer looking at Excel rather than text data files halfway, by using Dropbox to host Excel files which I can then access using the xlsx package.
Rather like downloading and unpacking a zipped file I assumed something like the following would work:
# Prerequisites
require("xlsx")
require("ggplot2")
require("repmis")
require("devtools")
require("RCurl")
# Downloading data from Dropbox location
link <- paste0(
    "https://www.dropbox.com/s/",
    "{THE SHA-1 KEY}",
    "{THE FILE NAME}"
)
url <- getURL(link)
temp <- tempfile()
download.file(url, temp)
However, I get Error in download.file(url, temp) : unsupported URL scheme
Is there an alternative to download.file that will accept this URL scheme?
Thanks,
Jon
You have the wrong URL - the one you are using just goes to the landing page. The actual download URL is different; I managed to get it sort of working with the code below.
I don't think you actually need RCurl or the getURL() function, and I think you were leaving out some relatively important /'s in your previous formulation.
Try the following:
link <- paste("https://dl.dropboxusercontent.com/s",
              "{THE SHA-1 KEY}",
              "{THE FILE NAME}",
              sep = "/")
download.file(url = link, destfile = "your.destination.xlsx")
closeAllConnections()
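For what it's worth, current R (3.2 and later) can fetch https URLs natively, so a minimal sketch without RCurl is just download.file() with mode = "wb" - binary mode matters for .xlsx, which would otherwise be corrupted on Windows:

```r
# The dl.dropboxusercontent.com host serves the raw file rather than
# the landing page; mode = "wb" keeps the binary .xlsx intact
link <- paste("https://dl.dropboxusercontent.com/s",
              "{THE SHA-1 KEY}",
              "{THE FILE NAME}",
              sep = "/")
download.file(link, destfile = "your.destination.xlsx", mode = "wb")
```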
UPDATE:
I just realised there is a source_XlsxData function in the repmis package, which in theory should do the job perfectly.
Also the function below works some of the time but not others, and appears to get stuck at the GET line. So, a better solution would be very welcome.
I decided to try taking a step back and figure out how to download a raw file from a secure (https) url. I adapted (butchered?) the source_url function in devtools to produce the following:
download_file_url <- function(url, outfile, ..., sha1 = NULL) {
    require(RCurl)
    require(devtools)
    require(repmis)
    require(httr)
    require(digest)
    stopifnot(is.character(url), length(url) == 1)
    filetag <- file(outfile, "wb")
    request <- GET(url)
    stop_for_status(request)
    writeBin(content(request, type = "raw"), filetag)
    close(filetag)
}
This seems to work for producing local copies of binary files - Excel included. Nicer, neater, smarter improvements gratefully received.

RCurl and getURL problems with "

I download some JSON files from Twitter with this command
library(RCurl)
getURL("http://search.twitter.com/search.json?since=2012-12-06&until=2012-12-07&q=royalbaby&result_type=recent&rpp=10&page=1")
But now all double quotes are transformed to \". In some special cases this destroys the JSON format. I think getURL or curl is making this change. Is there any way to suppress it?
Thanks
Markus
Your page already contains the \" characters; it is not RCurl behavior (try opening the page in a browser):
library(RJSONIO)
library(RCurl)
raw_data <- getURL(your.url)
data <- fromJSON(raw_data)
The data is well formatted.
Use cat() to avoid the backslash representation, which is only how R prints the string.
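To see that the backslashes are only R's print-time escaping and not in the data itself, compare print() and cat() on the same string (a small self-contained sketch):

```r
x <- "{\"key\": \"value\"}"

print(x)  # shows the escaped form: "{\"key\": \"value\"}"
cat(x)    # writes the raw text:    {"key": "value"}

# The string really contains plain double quotes and no backslashes,
# so JSON parsers see valid input
nchar(x)
grepl("\\", x, fixed = TRUE)
```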
