Authentication for BigQuery using bigrquery from an R Markdown document

I am having problems using bigrquery to connect to a GCP service account from within an R Markdown document that I knit. When I attempt from the console, authentication works fine. Both
library(bigrquery)
bq_auth()
and
library(bigrquery)
bq_auth(email = "my-service-account-email@myproject.iam.gserviceaccount.com")
launch a browser with a dialog that lets me pick and authenticate using the specified account as expected. But in the R Markdown, any attempt like
options("httr_oob_default" = TRUE)
bq_auth(email = "my-service-account-email@myproject.iam.gserviceaccount.com")
or even using the full list like this
bq_auth(
email = "my-service-account-email@myproject.iam.gserviceaccount.com",
path = NULL,
scopes = c("https://www.googleapis.com/auth/bigquery"),
cache = gargle::gargle_oauth_cache(),
use_oob = gargle::gargle_oob_default(),
token = NULL
)
leads to the error
Error: Can't get Google credentials.
Are you running bigrquery in a non-interactive session? Consider:
* Call `bq_auth()` directly with all necessary specifics.
Can anyone see what I am missing? Thanks in advance.

You can download the JSON key file of your Google Cloud service account, then pass its location as the path argument of the “bq_auth” function. Here are the steps:
Google Cloud Console (console.cloud.google.com)
Navigation Menu
IAM & Admin
Service Accounts
Create Service Account (create one)
Create Key, and save to "/path/to/jsonfilename.json"
Authenticate in your R Markdown code: bigrquery::bq_auth(path = "/path/to/jsonfilename.json")
Note: you'll need to make sure the service account has access to BigQuery. I set mine to "BigQuery Admin" and it worked, but that role may be broader than necessary.
Borrowed this answer from Elaine See's post on Medium: https://medium.com/@elaine.yl.see/easiest-way-to-use-bigquery-in-r-8af466cd55ca
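The steps above can be sketched as a setup chunk for the R Markdown document. This is only a minimal sketch, not the answerer's code: the environment variable name BQ_KEY_FILE and the helper resolve_key_path() are hypothetical, introduced so that the key file location does not have to be hard-coded in the document. Only bq_auth(path = ...) comes from the answer.

```r
# Hypothetical helper: read the key file location from an environment
# variable (BQ_KEY_FILE is an assumed name) and fail early with a clear
# message instead of the generic "Can't get Google credentials" error.
resolve_key_path <- function(env_var = "BQ_KEY_FILE") {
  path <- Sys.getenv(env_var, unset = "")
  if (!nzchar(path)) {
    stop("Environment variable ", env_var, " is not set.", call. = FALSE)
  }
  if (!file.exists(path)) {
    stop("Key file not found: ", path, call. = FALSE)
  }
  normalizePath(path)
}

# Authenticate only when the variable is set and bigrquery is installed,
# so the chunk can be knit without opening a browser.
if (nzchar(Sys.getenv("BQ_KEY_FILE")) &&
    requireNamespace("bigrquery", quietly = TRUE)) {
  bigrquery::bq_auth(path = resolve_key_path())
}
```

Because knitting runs in a non-interactive session, passing the key file path explicitly avoids the browser dialog that the console session relies on.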

Related

Authentication for BigQuery using bigrquery from R in Google Colab

I am trying to access my own data tables stored on Google BigQuery from my Google Colab notebook (with an R runtime) by running the following code:
# install.packages("bigrquery")
library("bigrquery")
bq_auth(path = "mykeyfile.json")
projectid = "work-366734"
sql <- "SELECT * FROM `Output.prepared_data`"
Running
tb <- bq_project_query(projectid, sql)
results in the following access denied error:
Access Denied: BigQuery BigQuery: Permission denied while globbing file pattern. [accessDenied]
For clarification, I already created a service account (under Google Cloud IAM and admin), gave it the roles ‘BigQuery Admin’ and ‘BigQuery Data Owner’, and extracted the above-mentioned json Key file ‘mykeyfile.json’. (as suggested here)
Additionally, I added the Role of the service account to the dataset (BigQuery – Sharing – Permissions – Add Principal), but still, the same error shows up…
Of course, I have already reset, deleted, and reinitialized the runtime.
Am I missing additional permissions somewhere else?
Thanks!
Not sure if it is relevant, but I add it just in case: I also tried the authentication process via
bq_auth(use_oob = TRUE, cache = FALSE)
which opens an additional window, where I have to allow access (using my Google Account, which is also the Data Owner) and enter an authorization code. While this step works, bq_project_query(projectid, sql) still gives the same Access Denied error.
Authorizing access to Google BigQuery using Python and the following commands works flawlessly (using the same account/credentials).
from google.cloud import bigquery
from google.colab import auth
auth.authenticate_user()
project_id = "work-366734"
client = bigquery.Client(project=project_id)
df = client.query( '''
SELECT
*
FROM
`work-366734.Output.prepared_data`
''' ).to_dataframe()

Trying to deploy shiny app with a google drive connection

My code uses a file located in the g-drive and I am having issues deploying the R-Shiny app to shinyapps.io because of this connection
The script works locally but I get the following error when I try to deploy:
"Error: Can't get google credentials
Are you runing googledrive in a non-interactive session? Consider:
drive_deauth() or drive_auth()..."
drive_deauth() gives me a 403 error regarding credentials to my own g-drive, and drive_auth() gives me an error similar to the one above, even when I pass all the different arguments in the documentation.
My latest attempt:
drive_auth_config(active = FALSE)
drive_find()
drive_download(
"CST_Tree.csv",
path = "..\\Shiny\\CST_Tree.csv",
overwrite = TRUE
)
df <- read.csv("CST_Tree.csv")
See my thread here.
I struggled for a while to get it working as I didn't (and still don't!) understand OAuth tokens.
My understanding of the googledrive and googlesheets4 packages is that they come "pre-configured" with a "public" token and more or less work "out-of-the-box", but you may run into API limits, since lots of people use that shared token.
I found it best to set up my own Google API Account and use that to authorise the use of GoogleDrive and GoogleSheets in my Shiny App.
It's not terribly straightforward, but hopefully my thread link helps?
First, set the auth options for Google Drive (this should go in your app.R):
options(
# whenever there is one account token found, use the cached token
gargle_oauth_email = TRUE,
# specify auth tokens should be stored in a hidden directory ".secrets"
gargle_oauth_cache = "Your project folder name/.secrets"
)
Then outside your app.R, be sure you store the tokens for google drive and google sheets:
googledrive::drive_auth()
googlesheets4::gs4_auth()
For more instruction try:
https://www.jdtrat.com/blog/connect-shiny-google/

Connect to googlesheets via shiny in R with googlesheets4

I'm trying to use an updated version of this example to connect to a private googlesheet via shiny, and deploy this app on the shinyapps.io server. The user is not required to authenticate to a google account as the app uses a specified pre-existing googlesheet.
I've followed this example (partly copied here), attempting to save the token to my shiny app:
# previous googlesheets package version:
shiny_token <- gs_auth() # authenticate w/ your desired Google identity here
saveRDS(shiny_token, "shiny_app_token.rds")
but tried to update it to googlesheets4, like this:
ss <- gs4_get("MY GOOGLE DOC URL") # do the authentication once, manually.
ss
gs4_has_token() # check that the token exists
# get token
ss_token <- gs4_token()
# save the token
save(ss_token, file = "APP PATH ... /data/tk.rdata")
Then in the app, I have placed this code outside the shinyApp() function.
load("data/tk.rdata")
googlesheets4::gs4_auth(token = ss_token, use_oob = T)
In the app, I connect to a google doc from the app, using a hardcoded id obtained from
ss$spreadsheet_id above. The app works locally.
After attempting to deploy the app to the server I get the error "...Can't get google credentials. Are you running googlesheets4 in a non-interactive session?... etc" I thought that the token would contain sufficient information for this.
I'd be grateful if anyone can point me to a guide to setting this up, and also comment on whether this approach (saving a token on the shinyapps.io) is safe?
I've looked at other examples, but it seems most are for the previous version of googlesheets
On 21-Jul-2021, googlesheets4 deprecated some of its functions when releasing v1.0.0.
I have updated volfi's answer to work with googlesheets4 v1.0.0.
It also works when deploying to shinyapps.io.
Set up non-interactive authentication
library(googlesheets4)
# Set authentication token to be stored in a folder called `.secrets`
options(gargle_oauth_cache = ".secrets")
# Authenticate manually
gs4_auth()
# If successful, the previous step stores a token file.
# Check that a file has been created with:
list.files(".secrets/")
# Check that the non-interactive authentication works by first deauthorizing:
gs4_deauth()
# Authenticate using token. If no browser opens, the authentication works.
gs4_auth(cache = ".secrets", email = "your@email.com")
Example - add data to a Google Sheet
Create a Google Sheet on Google Sheets and copy the sheet's url.
library(googlesheets4)
gs4_auth(cache = ".secrets", email = "your@email.com")
ss <- gs4_get("https://docs.google.com/path/to/your/sheet")
sheet_append(ss, data.frame(time=Sys.time()))
If deploying your app to shinyapps.io make sure to deploy the file in the .secrets folder.
Just follow the instructions in this link:
# designate project-specific cache
options(gargle_oauth_cache = ".secrets")
# check the value of the option, if you like
gargle::gargle_oauth_cache()
# trigger auth on purpose to store a token in the specified cache
# a browser will be opened
googlesheets4::sheets_auth()
# see your token file in the cache, if you like
list.files(".secrets/")
# sheets reauth with specified token and email address
sheets_auth(
cache = ".secrets",
email = "youremail"
)
I am posting here because I started from this thread on this journey, and want to share what finally worked after many hours of having a go, reading gargle, googledrive, and googlesheets4 documentation and oh so many other posts on this issue.
I first used the googlesheets4 function gs4_auth() to obtain a credential and stored it in a .secrets folder, as described in this thread and here. This worked on my desktop and I was excited. It did not work on shinyapps.io or on my Ubuntu 18.4 instance of shiny-server that I have on an AWS EC2 instance. The error was something like this:
"Error in ... : Can't get Google credentials. Are you running googledrive in a non-interactive session? Consider: drive_deauth() to prevent the attempt to get credentials. Call drive_auth() directly with all necessary specifics."
Then I tried an approach starting from here and taking me to here
Somehow this did work on shinyapps.io but still not on my Ubuntu shiny server.
This worked: I pursued a Google service account approach as described here. I created a project, then a service account for the project, added the Google Sheets API to the project, and downloaded a key as a JSON file. At the top of my app_server.R file I then called googlesheets4::gs4_auth(path = './<path to hidden JSON file folder I called .token>/.token/<JSON key file>.json'). This still did not work until the final step, which is not clearly explained almost anywhere I looked: go to the Google Sheet in question and "share" it with the client_email address from the JSON key file, giving it editor permissions in my case. This was finally well explained in this article: https://robocorp.com/docs/development-guide/google-sheets/interacting-with-google-sheets
Finally, I had read and write access for my app from Shiny Server on my AWS server instance. I really hope someone finds this useful.
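The sharing step described above needs the service account's address, which is stored in the client_email field of the JSON key file. A minimal sketch of reading it from R follows. The helper name service_account_email() and the demo key contents are hypothetical, and the demo file mimics only the one field of interest, not a full key.

```r
library(jsonlite)

# Hypothetical helper: return the address that the Google Sheet must be
# shared with, taken from the client_email field of the key file.
service_account_email <- function(key_path) {
  if (!file.exists(key_path)) {
    stop("Key file not found: ", key_path, call. = FALSE)
  }
  key <- jsonlite::fromJSON(key_path)
  email <- key$client_email
  if (is.null(email) || !nzchar(email)) {
    stop("No client_email field in ", key_path, call. = FALSE)
  }
  email
}

# Demonstration with a throwaway file (made-up address, not a real key)
demo_key <- tempfile(fileext = ".json")
writeLines(
  '{"type": "service_account", "client_email": "demo@example-project.iam.gserviceaccount.com"}',
  demo_key
)
service_account_email(demo_key)
```

Sharing the sheet with the returned address, with viewer or editor rights as needed, is what gives the service account access to that particular sheet.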

Non-interactive authentication for googlesheets4 using json path

Although it's not my main language, I'm using R to post results from a daily ETL into a Google Sheets worksheet. Because this is going to be a scheduled job that runs in perpetuity, I'm hesitant to use the interactive flow for authenticating Google Drive from R. I have the path for my JSON from my Google Drive account credentials, but when I pass
gs4_deauth()
gs4_has_token()
gs4_auth(email=<email-string>, path = jsonlite::fromJSON(<path-to-json>), cache = NULL)
it throws the following error:
[1] FALSE
Error: Can't get Google credentials.
Are you running googlesheets4 in a non-interactive session? Consider:
* `gs4_deauth()` to prevent the attempt to get credentials.
* Call `gs4_auth()` directly with all necessary specifics.
See gargle's "Non-interactive auth" vignette for more details:
https://gargle.r-lib.org/articles/non-interactive-auth.html
Execution halted
Since the error message doesn't provide many specifics, I'm wondering if anyone has experience with this or has any idea why it could be occurring. I'm using gs4_deauth() because I had originally logged in using the interactive flow, but now want to ensure that non-interactive authentication works.
As an aside, what's the least painful way to make a token object from the json credentials file? I believe that may be simpler than repeatedly passing the json credentials directly using the path argument.
In the case of a service account credential:
googlesheets4::gs4_auth(
path = "service_account.json",
scopes = "https://www.googleapis.com/auth/spreadsheets.readonly"
)
## NOTE: the spreadsheet is shared with the service account
s <- googlesheets4::read_sheet(ID_OF_YOUR_SPREADSHEET)
The token can be reused this way:
token <- googlesheets4::gs4_token()
googledrive::drive_auth(token = token)
googledrive::drive_has_token()
[1] TRUE
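On the aside about building a token object directly from the JSON file: one possible route, sketched here as an assumption rather than taken from the answer, is gargle::credentials_service_account(), which takes the key file path and scopes and returns a token that gs4_auth(token = ...) accepts. The wrapper make_service_token() and the environment variable GS4_KEY_FILE are hypothetical names. Note that path is given as a file path, not as the parsed result of jsonlite::fromJSON().

```r
# Hypothetical wrapper: build a service account token from a key file.
# The default scope is the Sheets read/write scope.
make_service_token <- function(
    key_path,
    scopes = "https://www.googleapis.com/auth/spreadsheets") {
  if (!file.exists(key_path)) {
    stop("Key file not found: ", key_path, call. = FALSE)
  }
  gargle::credentials_service_account(scopes = scopes, path = key_path)
}

# GS4_KEY_FILE is an assumed environment variable naming the key file.
# The block is skipped when it is unset or googlesheets4 is missing.
key_path <- Sys.getenv("GS4_KEY_FILE", unset = "")
if (nzchar(key_path) && requireNamespace("googlesheets4", quietly = TRUE)) {
  token <- make_service_token(key_path)
  googlesheets4::gs4_auth(token = token)
}
```

For a scheduled job, passing path directly to gs4_auth() as in the answer is usually enough. An explicit token mainly helps when the same credential is shared across several packages.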

How to knit dynamic reports with Google Analytics (rga)

I'm using rga to get some data from Google Analytics. From the repo:
The principle of this package is to create an instance of the API Authentication, which is a S4/5-class (utilizing the setRefClass). This instance then contains all the functions needed to extract data, and all the data needed for the authentication and reauthentication. The class is in essence self sustaining.
The package creates and saves a local instance using:
rga.open(instance="ga", where="~/ga.rga")
When I try to knit, however, I get an error that the ga object (what would be the instance) is not found. The code works when I run the chunks in RStudio, however—I believe the error is related to this aspect:
[The command above] will check if the instance is already created, and if it is, it'll prepare the token. If the instance is not created [...] it will redirect the client to a browser for authentication with Google.
My guess is that knitr can't perform that last step and so, the object is never created.
How can I make this work? I'm thinking that there might be a way to load the local ga.rga file to bypass browser authentication.
You can bypass browser authentication by passing the client ID and client secret, which you can get from the Google API console. Saving a local auth file in the dev environment is always risky. You can try this code, which uses the Google API and also saves the local instance:
rga.open(instance = "ga",
client.id = "<contains apps.googleusercontent.com>",
client.secret = "<your secret key>", where = "~/ga.rga")
Also ensure that the desktop option setting is enabled in the Google API console.
