Question: How do I pass a variable in an RPostgreSQL query?
Example: In the example below I try to pass the date '2018-01-03' to the query:
library(RPostgreSQL)
dt <- '2018-01-03'
connect <- dbConnect(PostgreSQL(),
dbname="test",
host="localhost",
port=5432,
user="user",
password="...")
result <- dbGetQuery(connect,
"SELECT * FROM sales_tbl WHERE date = #{dt}")
You can use paste0 to generate your query and pass it to dbGetQuery:
library(RPostgreSQL)
dt <- '2018-01-03'
connect <- dbConnect(PostgreSQL(),
dbname="test",
host="localhost",
port=5432,
user="user",
password="...")
query <- paste0("SELECT * FROM sales_tbl WHERE date='", dt, "'")
result <- dbGetQuery(connect, query)
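Note that pasting raw values into SQL strings leaves you open to SQL injection. If you do build the string yourself, you can at least let DBI escape the value; a minimal sketch, reusing the connect object from above:
# Let DBI quote the value instead of inserting quotes by hand
query <- paste0("SELECT * FROM sales_tbl WHERE date = ",
                DBI::dbQuoteLiteral(connect, dt))
result <- dbGetQuery(connect, query)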
The safest way is to parameterize the query.
Example:
library(RPostgreSQL)
dt <- '2018-01-03'
connect <- dbConnect(drv = PostgreSQL(),
dbname ="test",
host = "localhost",
port = 5432,
user = "user",
password = "...")
query <- "SELECT * FROM sales_tbl WHERE date= ?"
sanitized_query <- dbSendQuery(connect, query)
dbBind(sanitized_query, list(dt))
result <- dbFetch(sanitized_query)
Here, by binding the value to the ? placeholder instead of pasting it into the string, you sanitize your query and avoid SQL injection attacks.
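As a hedged aside, placeholder syntax varies by driver. With the newer RPostgres package the placeholder is $1, and dbGetQuery() can bind parameters directly; a sketch assuming an RPostgres connection:
library(RPostgres)
con <- dbConnect(Postgres(), dbname = "test", host = "localhost",
                 port = 5432, user = "user", password = "...")
# $1 is bound to the first element of params
result <- dbGetQuery(con, "SELECT * FROM sales_tbl WHERE date = $1",
                     params = list(dt))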
Another thing I like to do is to create a .Renviron file to store my credentials. For example, for the connection above, the .Renviron file would look like this:
dbname = test
dbuser = me
dbpass = mypass
dbport = 5432
dbhost = localhost
Save the file and restart RStudio (R reads .Renviron at startup). Then access the credentials with Sys.getenv():
# Example:
connect <- dbConnect(drv = PostgreSQL(),
dbname = Sys.getenv("dbname"),
host = Sys.getenv("dbhost"),
port = Sys.getenv("dbport"),
user = Sys.getenv("dbuser"),
password = Sys.getenv("dbpass"))
Related
I'm running a PostgreSQL query based on an automated list of IDs stored in an R list. I'm trying to determine how to include that R list in my query so I don't have to hard-code the IDs each time I run my query.
For example, I have a script that produces the list
id <- c("001","002","003")
and my query looks something like this:
SELECT *
FROM my_query
WHERE my_query.id_col IN ('001', '002', '003')
which I run using RPostgres:
library(RPostgres)
snappConnection <- DBI::dbConnect(RPostgres::Postgres(),
host = "host",
dbname = "dbname",
user = "user",
password = "pword",
port = 0000)
core.data <- dbGetQuery(conn = snappConnection, statement = "SELECT * FROM my_query WHERE my_query.id_col IN ('001', '002', '003')")
Is there a way to reference my "id" list from R in my query so that when "id" updates to new values, the query also updates to those new values?
glue_sql from glue package should work:
query <- glue::glue_sql("
SELECT *
FROM my_query
WHERE my_query.id_col IN ({id*})
", .con = snappConnection)
core.data <- dbGetQuery(conn = snappConnection, statement = query)
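When rendered, the {id*} splice collapses the vector into a quoted, comma-separated list, roughly:
# <SQL> SELECT * FROM my_query WHERE my_query.id_col IN ('001', '002', '003')
so when id is regenerated with new values, re-running glue_sql() picks them up automatically.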
@dave-edison's answer solved my problem. In parallel with trying his approach, I got the following to work.
I saved the query below as "my_query.sql"
SELECT *
FROM my_query
WHERE my_query.id_col IN ('string_to_replace')
then created a string and used gsub on the string.
library(tidyverse)
temp.script <- read_file("my_query.sql")
core.data.script <- gsub('string_to_replace', paste0(id, collapse = "', '"), temp.script)
From there I just ran my RPostgres script like above.
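A DBI-only variant of the same idea, sketched here, quotes each ID with dbQuoteString() and splices the list in yourself:
# dbQuoteString is vectorized, so it quotes every id at once
in_list <- paste(DBI::dbQuoteString(snappConnection, id), collapse = ", ")
query <- paste0("SELECT * FROM my_query WHERE my_query.id_col IN (", in_list, ")")
core.data <- dbGetQuery(snappConnection, query)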
I am trying to load data to Snowflake using the following code, but I am getting an error.
con <- DBI::dbConnect(
drv = odbc::odbc(),
driver = "SnowflakeDSIIDriver",
server = "<>",
authenticator = 'externalbrowser',
warehouse = "<>",
database = "<>",
UID = "<>",
role = "<>"
)
DBI::dbAppendTable(con, name = DBI::Id(schema = "<>", table = "<>"), value = tmp[1:2,])
tmp was downloaded from Snowflake (the same table) using RStudio:
```{sql connection=con, output.var = 'tmp'}
select top 10 *
FROM <>
```
The error seems to be stemming from a VARIANT column where I store a JSON string.
Error in new_result(connection@ptr, statement, immediate) :
nanodbc/nanodbc.cpp:1374: 22000: SQL compilation error:
Expression type does not match column data type, expecting VARIANT but got VARCHAR(2) for column FEATURES
I had this once, and it was invalid JSON (missing brackets somewhere). Perhaps this helps.
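If the JSON itself is valid, another workaround worth sketching (the table and column names below are placeholders) is to route the string through Snowflake's PARSE_JSON so it is cast to VARIANT, since a bound string arrives as VARCHAR:
# Insert one JSON string, letting Snowflake cast it to VARIANT
DBI::dbExecute(con, "
  INSERT INTO my_schema.my_table (FEATURES)
  SELECT PARSE_JSON(?)",
  params = list(tmp$FEATURES[1]))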
I am loading a dataset called df; it is a 10 GB file, and dbGetQuery is taking 6-8 minutes. Any suggestions to expedite this data loading script?
conn <- dbConnect(MySQL(),
host = DB$host,
port = DB$port,
user = DB$user,
password = DB$password,
dbname = "device_onboard")
df<-dbGetQuery(conn, "select * from device_onboard.onboard")
When I run the insert statement with the odbc driver, everything is fine.
drv <- odbc::odbc()
conn <- DBI::dbConnect(drv, trusted_connection = TRUE, dsn = "mydsn", uid = "myuid", pwd = "mypwd")
DBI::dbSendQuery(conn, "INSERT INTO \"dbo\".\"testjdbc\" (d) values('4')")
When I run the select statement with jdbc everything is fine too:
drv <- RJDBC::JDBC(driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver", classPath = "C:\\mssql-jdbc-7.0.0.jre8.jar")
conn <- DBI::dbConnect(drv, trusted_connection = T, url = "jdbc:sqlserver://myserver\\\\myinstance:1111;databaseName=mydatabasename", user="myuid", password="mypwd")
DBI::dbGetQuery(conn, "Select * from dbo.mytable")
and the connection for jdbc is valid:
drv <- RJDBC::JDBC(driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver", classPath = "C:\\mssql-jdbc-7.0.0.jre8.jar")
conn <- DBI::dbConnect(drv, trusted_connection = T, url = "jdbc:sqlserver://myserver\\\\myinstance:1111;databaseName=mydatabasename", user="myuid", password="mypwd")
DBI::dbIsValid(conn) # TRUE
But when I try insert statement (the same like the first one) with jdbc driver like this:
drv <- RJDBC::JDBC(driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver", classPath = "C:\\mssql-jdbc-7.0.0.jre8.jar")
conn <- DBI::dbConnect(drv, trusted_connection = T, url = "jdbc:sqlserver://myserver\\\\myinstance:1111;databaseName=mydatabasename", user="myuid", password="mypwd")
DBI::dbSendQuery(conn, "INSERT INTO \"dbo\".\"testjdbc\" (d) values('4')")
then I get the error:
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for INSERT INTO "dbo"."testjdbc" (d) values('4') (The statement did not return a result set.)
So JDBC selects are OK, but inserts, updates, and deletes give errors, while with odbc I can do everything.
The solution is to run inserts not with DBI::dbSendQuery but with RJDBC::dbSendUpdate.
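For example, with the conn from above:
# dbSendUpdate executes statements that return no result set
RJDBC::dbSendUpdate(conn, "INSERT INTO \"dbo\".\"testjdbc\" (d) values('4')")
That is exactly the case the "did not return a result set" error was complaining about.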
Thank you @Mark Rotteveel for your answer. Thanks to you I have found the solution.
I'm trying to connect to a database in Redshift from my Mac.
I managed to connect to Redshift with both dplyr and RPostgreSQL, but even though I can see all the available tables regardless of schema, I'm unable to access any of them, as they are all under different schemas.
I've tried all sorts of syntax to specify the schema, but I'm not getting anywhere.
Here's my RPostgreSQL code:
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
postgre.conn <-dbConnect(drv,
host="localhost", port="XXXX", dbname="redshiftdb",
user="XXXX", password="XXXX")
dbListTables(postgre.conn)
This lists all the tables regardless of schema.
I can see all the tables under a specific schema, so this works:
dbGetQuery(postgre.conn,
"SELECT table_name FROM information_schema.tables
WHERE table_schema='my_schema'")
but I'm then unable to access data from my_schema.my_table with any of these commands:
dbSendQuery(postgre.conn,"SELECT * FROM my_table LIMIT 10")
dbSendQuery(postgre.conn,"SELECT * FROM my_schema.my_table LIMIT 10")
dbSendQuery(postgre.conn,"SELECT * FROM my_table WHERE table_schema='my_schema' LIMIT 10")
dbSendQuery(postgre.conn,"SELECT * FROM c("my_schema", "my_table") LIMIT 10")
Similarly here's my dplyr code:
library(dplyr)
dplyr.conn <- src_postgres(host="localhost", port="XXXX",
dbname = "redshiftdb", user = "XXXX", password = "XXXX")
head(src_tbls(dplyr.conn)) # lists all the tables, regardless of schema
But then, none of these work:
tbl(dplyr.conn, sql("SELECT * FROM my_table LIMIT 10"))
tbl(dplyr.conn, sql("SELECT * FROM my_schema.my_table LIMIT 10"))
and I also tried specifying the search path in both cases, like so:
dplyr.conn <- src_postgres(host="localhost", port="XXXX",
dbname = "redshiftdb", user = "XXXX", password = "XXXX",
options="-c search_path=my_schema")
postgre.conn <-dbConnect(drv,
host="localhost",
port="XXXX",
dbname="redshiftdb",
user="XXXX",
password="XXXX",
options="-c search_path=my_schema")
but these still didn't work:
tbl(dplyr.conn, sql("SELECT * FROM my_table LIMIT 10"))
dbSendQuery(postgre.conn,"SELECT * FROM my_table LIMIT 10")
Any ideas?
Use the in_schema() command. The code would be something like:
t <- tbl(dplyr.conn, in_schema("schema_name", "table_name"))
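in_schema() comes from dbplyr in current releases (older dplyr versions exported it directly). Once the tbl is created, ordinary dplyr verbs are lazily translated to SQL; a sketch:
library(dbplyr)
t <- tbl(dplyr.conn, in_schema("my_schema", "my_table"))
collect(head(t, 10))  # runs SELECT ... LIMIT 10 and pulls the rows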
Another option is glue_sql from the glue package, which quotes the schema and table names as identifiers:
library(glue)
schema <- "your_schema"
tbl <- "your_table"
var <- "your_var"
conn <- "your_connection_to_database"
select_query <- glue_sql('
SELECT {`var`}
FROM {`schema`}.{`tbl`} ', .con = conn)
DBI::dbGetQuery(conn, select_query)
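With the placeholders above, the rendered statement should look roughly like:
# <SQL> SELECT "your_var" FROM "your_schema"."your_table"
The backticks in {`var`} tell glue_sql to quote the value as an identifier using the connection's quoting rules; plain {var} would instead render it as a string literal.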