query2 <- dbGetQuery(con, paste0("select uniqueno from postgres.asset
  where location_id in (", dbGetQuery(con, "select location_id
  from postgres.location where city_name = '", cityname, "')")
Both postgres.asset and postgres.location are Postgres tables.
The error is raised for the inner query, which is unable to find the connection parameter.
Please let me know how to get rid of the error; I suspect it comes from misplaced paste0 calls.
locationid <- dbGetQuery(con, paste0("select location_id from postgres.location where city_name = '", cityname, "'"))
uniqueid <- vector("list", nrow(locationid))
for (i in 1:nrow(locationid)) {
  # store each location's result instead of overwriting it on every iteration
  uniqueid[[i]] <- dbGetQuery(con, paste0("select uniqueno from postgres.asset where location_id = '", locationid[i, ], "'"))
}
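Alternatively, the whole lookup can be done in a single round trip by letting the database evaluate the inner query as a subquery. A minimal sketch (dbQuoteString is DBI's helper for safely quoting string literals):
# One round trip: the database runs the inner query as a subquery
city_lit <- DBI::dbQuoteString(con, cityname)  # safely quote the city name
query2 <- dbGetQuery(con, paste0(
  "select uniqueno from postgres.asset where location_id in ",
  "(select location_id from postgres.location where city_name = ", city_lit, ")"))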
I have a data frame in R with several different data types. While writing the data frame from R to a Redshift server, I get an error only with the character and timestamp values. The R code snippet below reproduces the issue.
library(lubridate)
library(dplyr)
dat <- data.frame(id = letters[1:2], x = 2:3, date = now())
dat
str(dat)
library(RPostgreSQL)  # provides the "PostgreSQL" DBI driver used below
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, host="redshift.amazonaws.com", port="5439", dbname="abcd", user="xyz", password="abc")
DBI::dbGetQuery(conn, "DROP TABLE test21;")
DBI::dbGetQuery(conn, "CREATE TABLE test21 ( id VARCHAR(255), x INT, date timestamp);")
chunksize = 100
for (i in 1:ceiling(nrow(dat) / chunksize)) {
  # build one multi-row INSERT statement per chunk
  query = paste0('INSERT INTO test21 (', paste0(colnames(dat), collapse = ','), ') VALUES ')
  vals = NULL
  for (j in 1:chunksize) {
    k = (i - 1) * chunksize + j
    if (k <= nrow(dat)) {
      # paste0() on a data frame row coerces each column to text
      vals[j] = paste0('(', paste0(dat[k, ], collapse = ','), ')')
    }
  }
  query = paste0(query, paste0(vals, collapse = ','))
  DBI::dbExecute(conn, query)
}
While running the last part, I am getting the below error:
RS-DBI driver: (could not Retrieve the result : ERROR: column "date" is of type timestamp without time zone but expression is of type numeric
HINT: You will need to rewrite or cast the expression.
When I manually inserted the values into the Redshift table, it worked as expected:
DBI::dbGetQuery(conn, "INSERT INTO test21 (id, x, date) values ('a','2','2019-02-08 15:21:08'),('b','3','2019-02-08 15:21:08')")
I sense that this issue comes from some programmatic error. Requesting your advice on where I am going wrong with the code.
In the date field of your data frame, try replacing now() with substr(now(), 1, 19), so the timestamp is stored as a character string and pasted into the query as text rather than as its numeric representation.
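Alternatively, the manual value formatting can be avoided altogether by letting the driver handle quoting and type conversion. A minimal sketch, assuming the test21 table already exists as created above and the driver supports appending over this connection:
# append the whole data frame; the driver quotes and converts each column
DBI::dbWriteTable(conn, "test21", dat, append = TRUE, row.names = FALSE)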
I'm trying to use text substitution in R to put custom dates into my SQL query over an ODBC connection.
For example, I'd like to set date1 to 2016-01-31 and have the query execute with that date automatically. However, text replacement with bquote doesn't seem to work...
Any ideas?
library("rodbc")
date1 <- c("2016-12-31")
myconn <- odbcConnect("edwPROD",uid="username",pwd="BBBBB")
data1 <- sqlQuery(myconn,"
SELECT a.*
FROM (SELECT id
,status_code
,rate_plan
,publication
,active_count
FROM prod_view.fct_active
WHERE snap_start_date<=bquote(.date1)
) AS a
")
odbcClose(myconn)
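The reason the substitution fails is that bquote(.date1) sits inside the quoted SQL string, so R never evaluates it and the server receives the text verbatim. A quick check:
s <- "WHERE snap_start_date<=bquote(.date1)"
cat(s)  # prints the literal text, bquote() included; nothing was substituted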
This is a job for the infuser package. It lets you substitute one part of the SQL query, in this case date1.
library(infuser)
date1 <- c("2016-12-31")
sql_query_template <- "SELECT a.*
FROM (SELECT id
,status_code
,rate_plan
,publication
,active_count
FROM prod_view.fct_active
WHERE snap_start_date<='{{date1}}'
) AS a;"
sql_query <- infuse(sql_query_template, date1 = date1)
myconn <- odbcConnect("edwPROD",uid="username",pwd="BBBBB")
data1 <- sqlQuery(myconn,sql_query)
odbcClose(myconn)
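If you would rather avoid the extra dependency, the same substitution can be done with base R's sprintf; a sketch:
date1 <- "2016-12-31"
# %s is replaced by the date string before the query is sent
sql_query <- sprintf(
  "SELECT a.*
   FROM (SELECT id, status_code, rate_plan, publication, active_count
         FROM prod_view.fct_active
         WHERE snap_start_date <= '%s') AS a;",
  date1)
data1 <- sqlQuery(myconn, sql_query)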
Normally we have no trouble using the connection method below to run queries against Redshift:
require("RPostgreSQL")
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, dbname = "redshiftdb",
host = "XX.XX.XX.XX", port = "1234",
user = "userid", password = "pwd")
my_data <- dbGetQuery(conn, "select a.*, b.* from redshiftdb.schema1.table1 a inner join redshiftdb.schema2.table2 b on a.key = b.key")
But the problem with this method is that people can write long, complex SQL queries that become hard to debug and explain when re-engineering, unless you are a hard-core SQL coder.
I have been learning R since September, and I thought it would be interesting to use dplyr joins and pipes to do the same work.
I connected using:
conn <- src_postgres(dbname = "redshiftdb",
host = "XX.XX.XX.XX", port = 1234,
user = "userid",
password = "pwd")
my_tbl1 <- tbl(conn, dplyr::sql('select * from schema1.table1'))
my_tbl2 <- tbl(conn, dplyr::sql('select * from schema1.table2'))
my_tbl3 <- tbl(conn, dplyr::sql('select * from schema1.table3'))
my_tbl4 <- tbl(conn, dplyr::sql('select * from schema1.table4'))
my_tbl5 <- tbl(conn, dplyr::sql('select * from schema2.table1'))
my_tbl6 <- tbl(conn, dplyr::sql('select distinct var1, var2 from schema2.table2'))
my_tbl7 <- tbl(conn, dplyr::sql('select * from schema2.table3'))
I get an error when I use left_join and %>% to join tables within schema1 as well as across schemas (i.e. schema1 and schema2).
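For reference, the joins were attempted along these lines (a sketch; "key" stands in for the actual join column, which is not shown in the question):
joined <- my_tbl1 %>%
  left_join(my_tbl5, by = "key")  # both tbls share conn, so dplyr should build one SQL query
show_query(joined)                # inspect the SQL that dplyr generates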
When I use copy = TRUE it is really time-consuming and gives a WARNING that only 100,000 records were copied.
I have checked
https://github.com/hadley/dplyr/issues/244
but the pool method does not seem to work.
Any help would be much appreciated; otherwise learning dplyr will be of no use for my immediate purpose.
I am using the RSQLite package in a Shiny app. I need to be able to dynamically update an SQLite database as users progress through the app. I want to use SQLite's UPDATE syntax for this, but I have run into a problem when trying to update multiple rows for the same user.
Consider the following code:
# Load libraries
library("RSQLite")
## Path for SQLite db
sqlitePath <- "test.db"
# Create db to store tables
con <- dbConnect(SQLite(),sqlitePath)
## Create toy data
who <- c("jane", "patrick", "samantha", "jane", "patrick", "samantha")
tmp_var_1 <- c(1,2,3, 4, 5, 6)
tmp_var_2 <- c(2,4,6,8,10,12)
# Create original table
users <- data.frame(who = as.character(who), tmp_var_1 = tmp_var_1, tmp_var_2 = tmp_var_2)
users$who <- as.character(users$who)
# Write original table
dbWriteTable(con, "users", users)
# Subset users data
jane <- users[who=="jane",]
patrick <- users[who=="patrick",]
samantha <- users[who=="samantha",]
# Edit Jane's data
jane$tmp_var_1 <- c(99,100)
# Save edits back to SQL (this is where the problem is!)
table <- "users"
db <- dbConnect(SQLite(), sqlitePath)
query <- sprintf(
  "UPDATE %s SET %s = ('%s') WHERE who = %s",
  table,
  paste(names(jane), collapse = ", "),
  paste(jane, collapse = "', '"),
  "'jane'"
)
dbGetQuery(db, query)
## Load data to check update has worked
loadData <- function(table) {
  # Connect to the database
  db <- dbConnect(SQLite(), sqlitePath)
  # Construct the fetching query
  query <- sprintf("SELECT * FROM %s", table)
  # Submit the fetch query and disconnect
  data <- dbGetQuery(db, query)
  dbDisconnect(db)
  data
}
loadData("users")
Here I am trying to update the entry for Jane so that the values of tmp_var_1 change but all other columns remain the same. In response to questions from @zx8754 and @Altons in the comments, the value of query is as follows:
UPDATE users SET who, tmp_var_1, tmp_var_2 = ('c(\"jane\", \"jane\")', 'c(99, 100)', 'c(2, 8)') WHERE who = 'jane'
The problem is almost certainly coming from the way I am specifying the query to RSQLite. When I run dbGetQuery(db, query) I get the following error:
Error in sqliteSendQuery(con, statement, bind.data) :
error in statement: near ",": syntax error
Any suggestions for improvement would be most welcome.
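For what it's worth, SQL's UPDATE requires one column = value pair per column (SET col1 = v1, col2 = v2, ...), and a single UPDATE ... WHERE who = 'jane' would set both of Jane's rows to the same value. A minimal sketch of one way around this, assuming only tmp_var_1 needs to change and using SQLite's built-in rowid to address each row individually:
db <- dbConnect(SQLite(), sqlitePath)
# fetch the rowids of Jane's rows so each can be updated separately
ids <- dbGetQuery(db, "SELECT rowid FROM users WHERE who = 'jane'")$rowid
new_vals <- c(99, 100)
for (i in seq_along(ids)) {
  # parameterised statement: no manual quoting or pasting of values
  dbExecute(db, "UPDATE users SET tmp_var_1 = ? WHERE rowid = ?",
            params = list(new_vals[i], ids[i]))
}
dbDisconnect(db)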
I'm using an RJDBC connection to query results from a Vertica database into R. I'm creating a comma-separated string of quoted zip codes that I then paste into my query, as shown below.
b <- paste("'20882'", "'01441'", "'20860'", "'02139'", sep = ", ")
SQL <- paste("select zip, count(*)
from tablea a
inner join tableb b on a.id = b.id
inner join tablec c on c.col = b.col
where b.zip in (", b, ") group by 1 order by 1", sep = " ")
result <- dbGetQuery(vertica, SQL)
I'm using this in a loop within a function, and I'll be adding zip codes to the vector b as I go. I was wondering if there is an easy way to do this?
I've been trying, but I can't add items to the vector in a way that lets the query execute.
Something like the following
b <- c(add_zip, b)
which could then be re-used in the body of the query.
Any suggestions?
Thanks,
Ben
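One approach that might help: keep the zip codes in a plain character vector and build the quoted, comma-separated IN list only when the query string is constructed. A sketch based on the tables in the question:
zips <- c("20882", "01441", "20860", "02139")
zips <- c(zips, add_zip)  # append new codes as they arrive
# quote each code and join them only at query-build time
in_list <- paste0("'", zips, "'", collapse = ", ")
SQL <- paste0("select zip, count(*) from tablea a ",
              "inner join tableb b on a.id = b.id ",
              "inner join tablec c on c.col = b.col ",
              "where b.zip in (", in_list, ") group by 1 order by 1")
result <- dbGetQuery(vertica, SQL)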