Using a variable in the "IN" clause of a SQL query in R

I have a variable x which contains 20000 IDs. I want to write a SQL query like
select * from tablename where ID in x;
I am trying to implement this in R so that I get the rows only for the IDs in x. This is my attempt:
dbSendQuery(mydb, "select * from tablename where ID in ('$x') ")
This does not raise an error, but it returns 0 rows.
Next I tried using
sprintf("select * from tablename where ID in %s",x)
But this creates 20000 individual queries, which could prove costly for the database.
Can anybody suggest a way to write a command that fetches the rows for all IDs in x in a single query and saves them to a dataframe in R?

You need to have the codes in the actual string. Here is how I would do it with gsub:
x <- LETTERS[1:3]
sql <- "select * from tablename where ID in X_ID_CODES "
x_codes <- paste0("('", paste(x, collapse="','"), "')")
sql <- gsub("X_ID_CODES", x_codes, sql)
# see new output
cat(sql)
select * from tablename where ID in ('A','B','C')
# then submit the query
#dbSendQuery(mydb, sql)

How about pasting it (this works as-is for numeric IDs; character IDs also need quotes around each value):
dbSendQuery(mydb, paste("select * from tablename where ID in (", paste(x, collapse = ","), ")"))
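Both answers paste the IDs straight into the SQL string, which breaks if an ID contains a single quote. A minimal sketch of a helper that quotes each ID and doubles any embedded quote before collapsing; `build_in_clause` is a hypothetical name, not part of DBI, and the example IDs are made up:

```r
# Hypothetical helper (not part of DBI): build "('a', 'b', ...)" from a vector.
build_in_clause <- function(ids) {
  # Double embedded single quotes, the standard SQL way to escape them.
  escaped <- gsub("'", "''", as.character(ids), fixed = TRUE)
  paste0("(", paste0("'", escaped, "'", collapse = ", "), ")")
}

x <- c("A", "B", "O'Neil")  # made-up IDs for illustration
sql <- paste("select * from tablename where ID in", build_in_clause(x))
cat(sql, "\n")

# Then submit the single query; dbGetQuery() returns a data frame directly.
# df <- dbGetQuery(mydb, sql)
```

With 20000 IDs the resulting statement is long; whether the database accepts an IN list of that size depends on the backend, so treat this as a starting point.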

Related

Insert R list into RPostgreSQL query

I'm running a PostgreSQL query based on an automatically generated list of IDs stored in an R list. I'm trying to determine how to include that R list in my query so I don't have to hard-code the IDs each time I run my query.
For example, I have a script that produces the list
id <- c("001","002","003")
and my query looks something like this:
SELECT *
FROM my_query
WHERE my_query.id_col IN ('001', '002', '003')
which I run using RPostgres:
library(RPostgres)
snappConnection <- DBI::dbConnect(RPostgres::Postgres(),
host = "host",
dbname = "dbname",
user = "user",
password = "pword",
port = 0000)
core.data <- dbGetQuery(conn = snappConnection, statement = "SELECT * FROM my_query WHERE my_query.id_col IN ('001', '002', '003')")
Is there a way to reference my "id" list from R in my query so that when "id" updates to new values, the query also updates to those new values?
glue_sql from the glue package should work:
query <- glue::glue_sql("
SELECT *
FROM my_query
WHERE my_query.id_col IN ({id*})
", .con = snappConnection)
core.data <- dbGetQuery(conn = snappConnection, statement = query)
@dave-edison's answer solved my problem. While trying it, I also got this to work.
I saved the query below as "my_query.sql"
SELECT *
FROM my_query
WHERE my_query.id_col IN ('string_to_replace')
then created a string and used gsub on the string.
library(tidyverse)
temp.script <- read_file("my_query.sql")
core.data.script <- gsub('string_to_replace',paste0(id,collapse = "', '"),temp.script)
From there I just ran my RPostgres script like above.

R, ClickHouse: Expected: FixedString(34). Got: UInt64: While processing

I am trying to query a subset of data from a ClickHouse database from R.
Here is the example:
library(data.table)
library(RClickhouse)
library(DBI)
subset <- paste(traffic[,unique(IDs)][1:30], collapse = ',')
conClickHouse <- DBI::dbConnect('here is the connection')
DataX <- dbGetQuery(conClickHouse, paste0("select * from database
and IDs in (", subset ,") "))
As a result I get this error:
DB::Exception: Type mismatch in IN or VALUES section. Expected: FixedString(34).
Got: UInt64: While processing (IDs IN ....
Any help is appreciated
Thanks to the comment of @DennyCrane,
"select * from database where toFixedString(IDs,34) in
(toFixedString(ID1, 34), toFixedString(ID2,34 ))"
This query subsets properly.
https://clickhouse.tech/docs/en/sql-reference/functions/#strong-typing
Strong Typing
In contrast to standard SQL, ClickHouse has strong typing. In other words, it doesn’t make implicit conversions between types. Each function works for a specific set of types. This means that sometimes you need to use type conversion functions.
https://clickhouse.tech/docs/en/sql-reference/functions/type-conversion-functions/#tofixedstrings-n
select * from (select 'x' B ) where B in (select toFixedString('x',1))
DB::Exception: Types of column 1 in section IN don't match: String on the left, FixedString(1) on the right.
use casting toString or toFixedString
select * from (select 'x' B ) where toFixedString(B,1) in (select toFixedString('x',1))
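The `toFixedString` form from the answer above can be generated from an R vector instead of typed by hand. A rough sketch, assuming the IDs are character strings; `my_table` and the example IDs are placeholders, and 34 is the width from the error message:

```r
# IDs are assumed to be character strings; my_table is a placeholder name.
ids <- c("id_one", "id_two")
width <- 34  # FixedString width reported in the error message

# Wrap every quoted ID in toFixedString(..., 34) and join with commas.
in_list <- paste0("toFixedString('", ids, "', ", width, ")", collapse = ", ")
query <- sprintf(
  "select * from my_table where toFixedString(IDs, %d) in (%s)",
  width, in_list
)
cat(query, "\n")

# DataX <- dbGetQuery(conClickHouse, query)
```

The original attempt collapsed the IDs without quotes, so ClickHouse parsed them as numbers (`UInt64`), which is what the error message reports.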

Bringing a R dataframe into a sql usable list with paste function

I have a dataframe/list in R (Rdataframe) that I want to use directly in an RODBC query, something like
Rdataframe= c('123456','234561','678912')
a= sqlQuery(connection, "Select * from table A where A.Id in Rdataframe")
The query has to work that way, i.e. I can't pull the table into R first and then do the lookup.
So I think it can only run if it is in a format such as
a= sqlQuery(connection, "Select * from table A where A.Id in ('123456','234561','678912')")
But despite several attempts with sprintf and paste, I have not been successful.
This is what I tried, without success:
attempt1= sqlQuery(connection, sprintf("Select * from table A where A.Id in %s", Rdataframe))
attempt2=paste(Rdataframe, sep=",")
And then using this attempt2 structure in the query.
Any help is appreciated.
Rdataframe= c('123456' , '234561' , '678912')
df_str = paste(Rdataframe , collapse = "','" , sep=" ")
queryStr = paste("Select * from table A where A.Id in ('" ,df_str , "')" , sep="")
print(queryStr)
Gives the output
[1] "Select * from table A where A.Id in ('123456','234561','678912')"
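To see why the `sprintf` attempt in the question fails, and as an alternative way to build the same list, here is a small sketch. It assumes R 3.6.0 or later, where `sQuote()` accepts `q = FALSE` to produce plain ASCII quotes:

```r
Rdataframe <- c("123456", "234561", "678912")

# sprintf() is vectorised: three IDs produce three separate query strings.
attempt <- sprintf("Select * from table A where A.Id in %s", Rdataframe)
length(attempt)

# Quote each ID with plain ASCII quotes, then join them into one string.
id_list <- toString(sQuote(Rdataframe, q = FALSE))
queryStr <- sprintf("Select * from table A where A.Id in (%s)", id_list)
print(queryStr)
```

The only difference from the `paste` version is a space after each comma, which SQL ignores.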

RMySQL retrieving data from database table basing on an if statement

I am trying to retrieve data from a database table based on a given condition:
I want to select everything from a table and, inside the while loop, apply a condition so that only the rows I want are returned, the way it is done in PHP:
mytable <- dbSendQuery(con, "select date from member")
while(!dbHasCompleted(mytable)){
if(name = 'myname'){
new_date <- dbFetch(mytable, n=-1)
print(mytable)
}
}
How do I get the if statement to work?
I later found a solution like this:
mytable <- dbSendQuery(con, "select * from member")
while(!dbHasCompleted(mytable)){
new_date <- dbFetch(mytable, name = 'john', n=-1)
print(new_date )
}
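As far as I know, `dbFetch()` only documents the arguments `res` and `n`, so `name = 'john'` above may be ignored silently rather than act as a filter. A hedged sketch of filtering the fetched rows in R instead; the data frame below is a made-up stand-in for the fetched result, and the columns `name` and `date` are assumptions about the `member` table:

```r
# Stand-in for the data frame that dbFetch(mytable, n = -1) would return;
# the columns name and date are assumptions about the member table.
member <- data.frame(
  name = c("john", "mary", "john"),
  date = c("2020-01-01", "2020-02-01", "2020-03-01"),
  stringsAsFactors = FALSE
)

# Keep only john's rows after fetching.
new_date <- member[member$name == "john", "date"]
print(new_date)

# Letting the database filter avoids transferring unwanted rows:
# new_date <- dbGetQuery(con, "select date from member where name = 'john'")
```

Putting the condition in a WHERE clause is usually the simpler route, since no loop or `if` is needed on the R side.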

R programming: RODBC and dataframe

I'm having trouble understanding how to pass a dataframe's column value to a query. The query works fine if I pass the value directly.
The values are stored in df$number. The table names are changed for sharing purposes:
dataframe <- sqlQuery(connection, "
SELECT
dimension1
FROM
table1
WHERE
dimension1 = df$number ")
I get the error [1] "42S22 904 [Oracle][ODBC][Ora]ORA-00904: \"df$number\": invalid identifier\n"
The gsubfn package can perform quasi-perl-style string interpolation. Preface any function with fn$ to turn it on for that function's arguments:
library(gsubfn)
num <- 3
dataframe <- fn$sqlQuery(connection,
"SELECT dimension1 FROM table1 WHERE dimension1 = $num ")
Alternately use paste or sprintf to construct the string:
sql <- paste("SELECT dimension1 FROM table1 WHERE dimension1 =", num)
sqlQuery(connection, sql)
or
sql <- sprintf("SELECT dimension1 FROM table1 WHERE dimension1 = %d", num)
sqlQuery(connection, sql)
dataframe <- sqlQuery(connection, paste("
SELECT
dimension1
FROM
table1
WHERE
dimension1 =", df$number))
The SQL statement is a string, so it cannot access your dataframe's value. To construct a string which uses the value use:
sql.base<-"SELECT dimension1 FROM table1 WHERE dimension1 = "
sql.completed<-paste0(sql.base,df$number)
sqlQuery(connection, sql.completed)
If you're looking to do a parameterized SQL query, i.e. you have a number of rows in df, you can either collapse the field values with a comma separator and put them inside parentheses within a SQL IN clause, or use the package RODBCext, which allows you to do it like this:
sqlExecute(connection, "SELECT dimension1 FROM table1 WHERE dimension1 = ?", df$number, fetch = TRUE)
For more info on RODBCext and this type of SQL, check out their vignette
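The first option, collapsing the values into an IN list, could look roughly like this. The values in `df` are made up for illustration, and the column is assumed to be numeric so no quoting is needed:

```r
# Made-up stand-in for the asker's df; dimension1 is assumed to be numeric.
df <- data.frame(number = c(3, 7, 11))

# Collapse all values into one comma-separated list for the IN clause.
in_values <- paste(df$number, collapse = ", ")
sql <- sprintf("SELECT dimension1 FROM table1 WHERE dimension1 IN (%s)", in_values)
cat(sql, "\n")

# sqlQuery(connection, sql)
```

This sends a single query for all rows of df, whereas pasting df$number directly onto the end of the string produces one query string per row.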

Resources