How to use a subquery in PostgreSQL using R

query2 <- dbGetQuery(con, paste0("select uniqueno from postgres.asset where location_id in (",
                     dbGetQuery(con, "select location_id from postgres.location where city_name = '", cityname, "')")
Both postgres.asset and postgres.location are Postgres tables.
The error is raised for the inner query, because it cannot find the connection parameter.
How do I get rid of this error? I suspect it comes from the misplaced paste0 command.

# Run the inner query first, then loop over the returned location_ids
locationid <- dbGetQuery(con, paste0("select location_id from postgres.location where city_name = '", cityname, "'"))
for (i in 1:nrow(locationid)) {
  uniqueid <- dbGetQuery(con, paste0("select uniqueno from postgres.asset where location_id = '", locationid[i, ], "'"))
}
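For reference, this can also run as a single SQL statement, with the database evaluating the subquery itself, so dbGetQuery() is called only once. A minimal sketch, assuming the same con and cityname and a driver that supports bound parameters (e.g. RPostgres, whose placeholder syntax is $1):

library(DBI)

# One round trip: the subquery runs inside the database.
# Binding cityname as a parameter also avoids the quoting
# problems that come with splicing strings via paste0().
query2 <- dbGetQuery(con,
  "select uniqueno
     from postgres.asset
    where location_id in (select location_id
                            from postgres.location
                           where city_name = $1)",
  params = list(cityname))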

Related

Error in Dataframe writing from R to Redshift

I have a dataframe in R with various data types. While writing the dataframe from R to a Redshift server, I get an error only with character and timestamp values. I am adding an R code snippet below to give you a better idea of the issue.
library(lubridate)
library(dplyr)
library(DBI)
library(RPostgreSQL)

dat <- data.frame(id = letters[1:2], x = 2:3, date = now())
dat
str(dat)

drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, host = "redshift.amazonaws.com", port = "5439",
                  dbname = "abcd", user = "xyz", password = "abc")

DBI::dbGetQuery(conn, "DROP TABLE test21;")
DBI::dbGetQuery(conn, "CREATE TABLE test21 (id VARCHAR(255), x INT, date TIMESTAMP);")

# Insert rows in chunks, building one multi-row INSERT per chunk
chunksize <- 100
for (i in 1:ceiling(nrow(dat) / chunksize)) {
  query <- paste0("INSERT INTO test21 (", paste0(colnames(dat), collapse = ","), ") VALUES ")
  vals <- NULL
  for (j in 1:chunksize) {
    k <- (i - 1) * chunksize + j
    if (k <= nrow(dat)) {
      vals[j] <- paste0("(", paste0(dat[k, ], collapse = ","), ")")
    }
  }
  query <- paste0(query, paste0(vals, collapse = ","))
  DBI::dbExecute(conn, query)
}
While running the last part, I am getting the below error:
RS-DBI driver: (could not Retrieve the result : ERROR: column "date" is of type timestamp without time zone but expression is of type numeric
HINT: You will need to rewrite or cast the expression.
When I manually inserted the values into the Redshift table, it worked as expected:
DBI::dbGetQuery(conn, "INSERT INTO test21 (id, x, date) values ('a','2','2019-02-08 15:21:08'),('b','3','2019-02-08 15:21:08')")
I suspect this issue comes from a programming error. Please advise on where I am going wrong with the code.
In the date field of your dataframe, try replacing
now()
with
substr(now(), 1, 19)
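The root cause is that dat[k, ] is a one-row data frame, so paste0() deparses each column: the POSIXct column comes out as its numeric epoch value and the other columns lose their quoting, which is exactly what the error message describes. The answer above sidesteps this by storing the date as a plain string. A sketch of the inner-loop line with explicit quoting instead (quote_val is a hypothetical helper, not part of DBI; same dat and chunking as above):

# Hypothetical helper: format() renders POSIXct as "YYYY-MM-DD HH:MM:SS"
# rather than a numeric epoch, and every value is single-quoted so
# Redshift parses character and timestamp literals correctly.
quote_val <- function(v) paste0("'", format(v), "'")

vals[j] <- paste0("(", paste(vapply(dat[k, ], quote_val, character(1)),
                             collapse = ","), ")")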

R text substitution in an ODBC sqlQuery

I'm trying to use text substitution in R to put custom dates into my SQL query over an ODBC connection.
For example, I could change date1 to 2016-01-31 and the query would execute with that date. However, text replacement with bquote doesn't seem to work.
Any ideas?
library("rodbc")
date1 <- c("2016-12-31")
myconn <- odbcConnect("edwPROD",uid="username",pwd="BBBBB")
data1 <- sqlQuery(myconn,"
SELECT a.*
FROM (SELECT id
,status_code
,rate_plan
,publication
,active_count
FROM prod_view.fct_active
WHERE snap_start_date<=bquote(.date1)
) AS a
")
odbcClose(myconn)
This is a job for the infuser package. It lets you substitute one part of the SQL request, in this case date1.
library(infuser)

date1 <- c("2016-12-31")
sql_query_template <- "SELECT a.*
  FROM (SELECT id
              ,status_code
              ,rate_plan
              ,publication
              ,active_count
        FROM prod_view.fct_active
        WHERE snap_start_date <= '{{date1}}'
       ) AS a;"
sql_query <- infuse(sql_query_template, date1 = date1)

myconn <- odbcConnect("edwPROD", uid = "username", pwd = "BBBBB")
data1 <- sqlQuery(myconn, sql_query)
odbcClose(myconn)
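If you would rather avoid the extra dependency, base R can do the same substitution; a minimal sketch with sprintf() (same connection and query as above):

# %s is replaced by the value of date1 before the query is sent
sql_query <- sprintf("SELECT a.*
  FROM (SELECT id, status_code, rate_plan, publication, active_count
        FROM prod_view.fct_active
        WHERE snap_start_date <= '%s') AS a;", date1)
data1 <- sqlQuery(myconn, sql_query)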

dbplyr - Error: x and y don't share the same src. Set copy = TRUE to copy y into x's source (this may be time consuming)

Normally we have no trouble using the connection method below to run queries against Redshift:
require("RPostgreSQL")
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, dbname = "redshiftdb",
host = "XX.XX.XX.XX", port = "1234",
user = "userid", password = "pwd")
my_data <- dbGetQuery(conn, "select a.*, b.* from redshiftdb.schema1.table1 inner join redshiftdb.schema2.table2 on a.key = b.key")
The problem with this method is that people can write long, complex SQL queries that become hard to debug and to explain when re-engineering, unless you are a hard-core SQL coder.
I have been learning R since September, and I thought it would be interesting to use dplyr joins and pipes to do the same work.
I connected using:
conn <- src_postgres(dbname = "redshiftdb",
                     host = "XX.XX.XX.XX", port = 1234,
                     user = "userid", password = "pwd")

my_tbl1 <- tbl(conn, dplyr::sql('select * from schema1.table1'))
my_tbl2 <- tbl(conn, dplyr::sql('select * from schema1.table2'))
my_tbl3 <- tbl(conn, dplyr::sql('select * from schema1.table3'))
my_tbl4 <- tbl(conn, dplyr::sql('select * from schema1.table4'))
my_tbl5 <- tbl(conn, dplyr::sql('select * from schema2.table1'))
my_tbl6 <- tbl(conn, dplyr::sql('select distinct var1, var2 from schema2.table2'))
my_tbl7 <- tbl(conn, dplyr::sql('select * from schema2.table3'))
I get the above error when using left_join and %>% to join tables within schema1 as well as across schemas (i.e. schema1 and schema2).
When I use copy = TRUE it takes a long time and warns that only 100,000 records were copied, which is really time consuming.
I have checked
https://github.com/hadley/dplyr/issues/244
but the pool method does not seem to work.
Any help would be much appreciated; otherwise learning dplyr will not serve my immediate purpose.
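Since no answer is recorded here, a hedged sketch of the usual fix: build every tbl from one shared connection and reference tables with dbplyr::in_schema(), so both sides of the join have the same src and dplyr can translate the whole pipeline into a single SQL statement that runs inside Redshift, with no copying. This assumes the same credentials as above and a shared join column named key:

library(DBI)
library(dplyr)
library(dbplyr)

# One shared connection: tbls built from it share the same src,
# so joins are translated to SQL and executed in the database.
conn <- dbConnect(RPostgreSQL::PostgreSQL(), dbname = "redshiftdb",
                  host = "XX.XX.XX.XX", port = 1234,
                  user = "userid", password = "pwd")

my_tbl1 <- tbl(conn, in_schema("schema1", "table1"))
my_tbl5 <- tbl(conn, in_schema("schema2", "table1"))

joined <- my_tbl1 %>%
  left_join(my_tbl5, by = "key") %>%  # "key" is an assumed column name
  collect()                           # materialize in R only at the end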

Update selected rows in an SQLite table in R

I am using the RSQLite package in a Shiny app. I need to dynamically update an SQLite database as users progress through the app. I want to use SQLite's UPDATE syntax to achieve this, but I have hit a problem when trying to update multiple rows for the same user.
Consider the following code:
# Load libraries
library("RSQLite")

## Path for SQLite db
sqlitePath <- "test.db"

# Create db to store tables
con <- dbConnect(SQLite(), sqlitePath)

## Create toy data
who <- c("jane", "patrick", "samantha", "jane", "patrick", "samantha")
tmp_var_1 <- c(1, 2, 3, 4, 5, 6)
tmp_var_2 <- c(2, 4, 6, 8, 10, 12)

# Create original table
users <- data.frame(who = as.character(who), tmp_var_1 = tmp_var_1, tmp_var_2 = tmp_var_2)
users$who <- as.character(users$who)

# Write original table
dbWriteTable(con, "users", users)

# Subset users data
jane <- users[who == "jane", ]
patrick <- users[who == "patrick", ]
samantha <- users[who == "samantha", ]

# Edit Jane's data
jane$tmp_var_1 <- c(99, 100)

# Save edits back to SQL (this is where the problem is!)
table <- "users"
db <- dbConnect(SQLite(), sqlitePath)
query <- sprintf(
  "UPDATE %s SET %s = ('%s') WHERE who = %s",
  table,
  paste(names(jane), collapse = ", "),
  paste(jane, collapse = "', '"),
  "'jane'"
)
dbGetQuery(db, query)
## Load data to check update has worked
loadData <- function(table) {
  # Connect to the database
  db <- dbConnect(SQLite(), sqlitePath)
  # Construct the fetching query
  query <- sprintf("SELECT * FROM %s", table)
  # Submit the fetch query and disconnect
  data <- dbGetQuery(db, query)
  dbDisconnect(db)
  data
}
loadData("users")
Here I am trying to update the entry for Jane so that the values of tmp_var_1 are changed but all other columns remain the same. In response to questions from @zx8754 and @Altons posted below, the value of query is as follows:
UPDATE users SET who, tmp_var_1, tmp_var_2 = ('c(\"jane\", \"jane\")', 'c(99, 100)', 'c(2, 8)') WHERE who = 'jane'
The problem is almost certainly coming from the way I am specifying the query to RSQLite. When I run dbGetQuery(db, query) I get the following error:
Error in sqliteSendQuery(con, statement, bind.data) :
error in statement: near ",": syntax error
Any suggestions for improvement would be most welcome.
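No answer is shown here, so a hedged sketch of one way to make the update work: issue one parameterized UPDATE per row via dbExecute(), binding values with ? placeholders instead of pasting R vectors into the string (the pasting is what produced the c(...) literals in the query above). This assumes the table still has its implicit SQLite rowid, which lets the two "jane" rows be told apart:

# Fetch the rowids of Jane's rows so each can be updated individually
ids <- dbGetQuery(db, "SELECT rowid FROM users WHERE who = 'jane'")$rowid

# One parameterized UPDATE per row; RSQLite binds the ? placeholders,
# so values are quoted correctly and nothing needs sprintf()
for (i in seq_along(ids)) {
  dbExecute(db,
            "UPDATE users SET tmp_var_1 = ? WHERE rowid = ?",
            params = list(jane$tmp_var_1[i], ids[i]))
}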

Add to the contents of a vector used in an RJDBC query

I'm using an RJDBC connection to query results from a Vertica database into R. I'm creating a comma-separated vector of zip codes and pasting it into my query, as shown below.
b <- paste("'20882'", "'01441'", "'20860'", "'02139'", sep = ", ")
SQL <- paste("select zip, count(*)
              from tablea a
              inner join tableb b on a.id = b.id
              inner join tablec c on c.col = b.col
              where b.zip in (", b, ") group by 1 order by 1", sep = " ")
result <- dbGetQuery(vertica, SQL)
I'm using this in a loop within a function, and I will be adding zip codes to the vector b as it runs. Is there an easy way to do this?
I've been trying, but I'm unable to add items to the vector in a way that still lets the query execute.
Something like the following
b <- c(add_zip, b)
which could then be re-run in the body of the query.
Any suggestions?
Thanks,
Ben
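No answer is recorded here, but a common pattern is to keep the zip codes as a plain character vector, grow it with c(), and quote it only when the query string is built. A minimal sketch (vertica is the existing RJDBC connection from the question; add_zip stands for whatever new zip codes are being appended):

zips <- c("20882", "01441", "20860", "02139")

# Append new zip codes as plain strings inside the loop...
zips <- c(zips, add_zip)

# ...and quote them only when building the query, so the vector
# stays easy to grow between executions
in_list <- paste0("'", zips, "'", collapse = ", ")
SQL <- paste0("select zip, count(*)
               from tablea a
               inner join tableb b on a.id = b.id
               inner join tablec c on c.col = b.col
               where b.zip in (", in_list, ") group by 1 order by 1")
result <- dbGetQuery(vertica, SQL)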
