Adding rows to a Google Sheet using the R package googlesheets

I'm using the googlesheets package (CRAN version, but also available here: https://github.com/jennybc/googlesheets) to read data from a Google Sheet in R, but would now like to add rows. Unfortunately, every time I use gs_add_row on an existing sheet I get the following error:
Error in gsheets_POST(lf_post_link, XML::toString.XMLNode(new_row)) :
client error: (405) Method Not Allowed
I followed the tutorial on Github to create a sheet and add rows as follows:
library(googlesheets)
library(dplyr)
df.colnames <- c("Project Short Name","Project Start Date","Proj Stuff")
my.df <- data.frame(a = "cannot be empty", b = "cannot be empty", c = "cannot be empty")
colnames(my.df) <- df.colnames
## Create a new workbook populated by this data.frame:
mynewSheet <- gs_new("mynewsheet", input = my.df, trim = TRUE)
## Append Element
mynewSheet <- mynewSheet %>% gs_add_row(input = c("a","b","c"))
mynewKey <- mynewSheet$sheet_key
Rows are added successfully; I even get the cheery message Row successfully appended.
I now provide mynewKey to gs_key, as I would if this were a new sheet I were working with, and attempt to add a new row using gs_add_row (note: before evaluating these lines, I navigate to the Google Sheet and make it public to the web):
myExistingWorkbook <- gs_key(mynewKey, visibility = "public")
## Attempt to gs_add_row
myExistingWorkbook <- myExistingWorkbook %>% gs_add_row(input = c("a","b","c"), ws="Sheet1", verbose = TRUE)
Error in gsheets_POST(lf_post_link, XML::toString.XMLNode(new_row)) :
client error: (405) Method Not Allowed
Things that I have tried:
1) Published the Google Sheet to the web (as per https://github.com/jennybc/googlesheets/issues/126#issuecomment-118751652)
2) Enabled the sheet as editable to the public
Notes
In my actual example, I have an existing Google Sheet with many worksheets within it that I would like to add rows to. I have tried to use a minimal example here to understand my error; I can also provide a link to the specific worksheet that I would like to update.
I have raised an issue on the package's GitHub page here: https://github.com/jennybc/googlesheets/issues/168

googlesheets::gs_add_row() and googlesheets::gs_edit_cells() make POST requests to the Sheets API. This requires that the visibility be set to "private".
Above, when you register the Sheet by key, please do so like this:
gs_key(mynewKey, visibility = "private")
If you want this to work even for Sheets you've never visited in the browser, then add lookup = FALSE as well:
gs_key(mynewKey, lookup = FALSE, visibility = "private")
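Putting it together, a minimal sketch of the registration-then-append flow (assuming gs_auth() has already been run and mynewKey holds the sheet key):
library(googlesheets)
## POST-based operations such as gs_add_row() need private visibility
wb <- gs_key(mynewKey, lookup = FALSE, visibility = "private")
wb <- gs_add_row(wb, input = c("a", "b", "c"), ws = "Sheet1")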

Related

How to scrape from investing.com using 'rusquant' package in R

I posted a similar question before [the question is closed now, I deleted it]. From that I came to know about the 'rusquant' package, and I thank the person who introduced me to it. I tried the following code in several unsuccessful attempts to scrape stock data from investing.com:
library(rusquant)
all_stocks <- getSymbolList(src = "Investing", country = "Bangladesh")
head(all_stocks, 4)
from_date <- as.Date("2021-01-01")
grameenphone <- getSymbols('GRAE', src = 'Investing', from = from_date, auto.assign = F)
grameenphone <- getSymbols.Investing('GRAE', from = from_date, auto.assign = F)
Now, the getSymbolList function works, but when I try to scrape a particular stock, following the method from https://github.com/arbuzovv/rusquant, I get an error as follows:
grameenphone <- getSymbols('GRAE', src = 'Investing', from = from_date, auto.assign = F)
‘getSymbols’ currently uses auto.assign=TRUE by default, but will
use auto.assign=FALSE in 0.5-0. You will still be able to use
‘loadSymbols’ to automatically load data. getOption("getSymbols.env")
and getOption("getSymbols.auto.assign") will still be checked for
alternate defaults.
This message is shown once per session and may be disabled by setting
options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
Error in curl::curl_fetch_memory(url, handle = handle) :
Unrecognized content encoding type. libcurl understands deflate, gzip content encodings.
Then I tried the getSymbols.Investing function, but I get the following error:
grameenphone <- getSymbols.Investing('GRAE', from = from_date, auto.assign = F)
Error in missing(verbose) : 'missing' can only be used for arguments
Please help me out here. I'm new to coding, and I apologize if anything silly happened here. Thanks in advance.

How to index data using index_create() in the elastic package in R

This is my code in R to index the iris data.
library(elastic)
iris<-datasets::iris
body <- list(data = list(iris))
index_create(index = 'iris',body = body)
But it gives the following error:
Error: 400 - Failed to parse content to map.
Please explain how to supply the data in the body of index_create().
elastic maintainer here. index_create is only for creating an index, as the function name indicates: it creates an index; it does not also insert data into the index. From your example you probably want
index_create(index = "iris")
docs_bulk(iris, "iris")
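To sanity-check that the documents made it in, you can query the index back; a minimal sketch (the index may need a moment to refresh before all 150 rows show up):
res <- Search(index = "iris", size = 5)
res$hits$total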

R - form web scraping with rvest

First I'd like to take a moment to thank the SO community; you have helped me many times in the past without my even needing to create an account.
My current problem involves web scraping with R, which is not my strong point.
I would like to scrape http://www.cbs.dtu.dk/services/SignalP/
What I have tried:
library(rvest)
url <- "http://www.cbs.dtu.dk/services/SignalP/"
seq <- "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM"
session <- rvest::html_session(url)
form <- rvest::html_form(session)[[2]]
form <- rvest::set_values(form, `SEQPASTE` = seq)
form_res_cbs <- rvest::submit_form(session, form)
#rvest prints out:
Submitting with 'trunc'
rvest::html_text(rvest::html_nodes(form_res_cbs, "head"))
#output:
"Configuration error"
rvest::html_text(rvest::html_nodes(form_res_cbs, "body"))
#output:
"Exception:WebfaceConfigErrorPackage:Webface::service : 358Message:Unhandled #parameter 'NULL' in form "
I am unsure what the unhandled parameter is.
Is the problem in the submit button? I cannot seem to force:
form_res_cbs <- rvest::submit_form(session, form, submit = "submit")
#rvest prints out
Error: Unknown submission name 'submit'.
Possible values: trunc
Is the problem that submit$name is NULL?
form[["fields"]][[23]]
I tried defining the fake submit button as suggested here:
Submit form with no submit button in rvest
with no luck.
I am open to solutions using rvest or RCurl/httr; I would like to avoid using RSelenium.
EDIT: thanks to hrbrmstr's awesome answer I was able to build a function for this task. It is available in the package ragp: https://github.com/missuse/ragp
Well, this is doable. But it's going to require elbow grease.
This part:
library(rvest)
library(httr)
library(tidyverse)

POST(
  url = "http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi",
  encode = "form",
  body = list(
    `configfile` = "/usr/opt/www/pub/CBS/services/SignalP-4.1/SignalP.cf",
    `SEQPASTE` = "MTSKTCLVFFFSSLILTNFALAQDRAPHGLAYETPVAFSPSAFDFFHTQPENPDPTFNPCSESGCSPLPVAAKVQGASAKAQESDIVSISTGTRSGIEEHGVVGIIFGLAFAVMM",
    `orgtype` = "euk",
    `Dcut-type` = "default",
    `Dcut-noTM` = "0.45",
    `Dcut-TM` = "0.50",
    `graphmode` = "png",
    `format` = "summary",
    `minlen` = "",
    `method` = "best",
    `trunc` = ""
  ),
  verbose()
) -> res
Makes the request you made. I left verbose() in so you can watch what happens. It's missing the "filename" field, but you specified the string, so it's a good mimic of what you did.
Now, the tricky part is that it uses an intermediary redirect page that gives you a chance to enter an e-mail address for notification when the query is done. It does do a regular (every ~10s or so) check to see if the query is finished and will redirect quickly if so.
That page has the query id which can be extracted via:
content(res, as="parsed") %>%
html_nodes("input[name='jobid']") %>%
html_attr("value") -> jobid
Now, we can mimic the final request, but I'd add in a Sys.sleep(20) before doing so to ensure the report is done.
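Sys.sleep(20)  # give the queued job time to finish before fetching the report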
GET(
  url = "http://www.cbs.dtu.dk/cgi-bin/webface2.fcgi",
  query = list(
    jobid = jobid,
    wait = "20"
  ),
  verbose()
) -> res2
That grabs the final results page:
html_print(HTML(content(res2, as="text")))
You can see images are missing because GET only retrieves the HTML content. You can use functions from rvest/xml2 to parse through the page and scrape out the tables and the URLs that you can then use to get new content.
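For example, a minimal sketch of pulling the tables out of the result page (the exact selectors depend on the report layout, so treat this as a starting point):
pg <- read_html(content(res2, as = "text"))
tabs <- html_table(html_nodes(pg, "table"), fill = TRUE)  # list of data frames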
To do all this, I used Burp Suite to intercept a browser session and then my burrp R package to inspect the results. You can also inspect visually in Burp Suite and build things more manually.

Adding rows with encoding to a Google Sheet with the R package googlesheets

I'm using this amazing package to read and upload data with my shiny app. It's working OK, but when I add a row to the sheet, it does not keep the same encoding as the server, nor does it behave like the data in the previous rows. Spanish names I entered manually are OK, but when I use the app to load data, special Latin characters (UTF-8) are replaced in the sheet.
That data is then not recognized by the app in the following sessions.
library(googlesheets)
table <- "Reportes"

saveData <- function(data) {
  # Grab the Google Sheet
  sheet <- gs_title(table)
  # Add the data as a new row
  gs_add_row(sheet, input = data)
}

loadData <- function() {
  # Grab the Google Sheet
  sheet <- gs_title(table)
  # Read the data
  gs_read_csv(sheet)
}
Then I use a button in the UI and an observer in the server to load the data...
observeEvent(input$enviar, {
  exit <- input$enviar
  if (exit == 1) {
    addData <- c(as.character(input$fecha),
                 as.character(input$local),
                 as.character(input$dpto),
                 as.character(input$estado),
                 as.character(input$fsiembra),
                 as.character(input$ref),
                 as.character(loc$lat[loc$Departamento == input$dpto & loc$Localidad == input$local]),
                 as.character(loc$long[loc$Departamento == input$dpto & loc$Localidad == input$local]),
                 as.character(getZafra(input$fecha)))
    saveData(addData)
    d <- loadData()
    reset('fecha')
    reset('dpto')
    reset('local')
    reset('estado')
    reset('fsiembra')
    reset('ref')
    reset('pass')
    disable('enviar')
  }
})
Please... if anyone can help I'd be very happy.
I discovered that I needed to encode the character vector before uploading...
I used:
Encoding(addData) = "latin1"
saveData(addData)
and it worked just fine!
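One way to fold the fix into the saveData() helper above (a sketch; the Encoding() call is the only new line):
saveData <- function(data) {
  Encoding(data) <- "latin1"  # mark the strings as latin1 before uploading
  sheet <- gs_title(table)
  gs_add_row(sheet, input = data)
}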

R tm package readPDF error in strptime(d, fmt) : input string too long

I would like to do text mining of the files on this website using the tm package. I am using the following code to download one of the files (i.e., abell.pdf) to my working directory and attempt to store the contents:
library("tm")
url <- "https://baltimore2006to2010acsprofiles.files.wordpress.com/2014/07/abell.pdf"
filename <- "abell.pdf"
download.file(url = url, destfile = filename, method = "curl")
doc <- readPDF(control = list(text = "-layout"))(elem = list(uri = filename),
               language = "en", id = "id1")
But I receive the following error and warnings:
Error in strptime(d, fmt) : input string is too long
In addition: Warning messages:
1: In grepl(re, lines) : input string 1 is invalid in this locale
2: In grepl(re, lines) : input string 2 is invalid in this locale
The pdfs aren't particularly long (5 pages, 978 KB), and I have been able to successfully use the readPDF function to read in other pdf files on my Mac OSX. The information I want most (the total population for the 2010 census) is on the first page of each pdf, so I've tried shortening the pdf to just the first page, but I get the same message.
I am new to the tm package, so I apologize if I am missing something obvious. Any help is greatly appreciated!
Based on what I've read, this error has something to do with the way the readPDF function tries to make metadata for the file you're importing. Anyway, you can change the metadata info by using the "info" option. For example, I usually circumvent this error by modifying the command as follows (using your code):
doc <- readPDF(control = list(info = "-f", text = "-layout"))(elem = list(uri = filename),
               language = "en", id = "id1")
The addition of info = "-f" is the only change. This doesn't really "fix" the problem, but it bypasses the error. Cheers :)
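As a quick check that the text came through (doc is a tm PlainTextDocument, so content() returns its lines as a character vector):
txt <- content(doc)
head(txt)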
