R programming: RODBC and dataframe - r

I'm having trouble understanding how to pass a data frame's column value into a query. The query works fine if I pass the value directly.
The values are stored in df$number. The table names have been changed for sharing purposes:
dataframe <- sqlQuery(connection, "
SELECT
dimension1
FROM
table1
WHERE
dimension1 = df$number ")
I get the error [1] "42S22 904 [Oracle][ODBC][Ora]ORA-00904: \"df$number\": invalid identifier\n"

The gsubfn package can perform quasi-perl-style string interpolation. Preface any function with fn$ to turn it on for that function's arguments:
library(gsubfn)
num <- 3
dataframe <- fn$sqlQuery(connection,
"SELECT dimension1 FROM table1 WHERE dimension1 = $num ")
Alternatively, use paste or sprintf to construct the string:
sql <- paste("SELECT dimension1 FROM table1 WHERE dimension1 =", num)
sqlQuery(connection, sql)
or
sql <- sprintf("SELECT dimension1 FROM table1 WHERE dimension1 = %d", num)
sqlQuery(connection, sql)

dataframe <- sqlQuery(connection, paste("
SELECT
dimension1
FROM
table1
WHERE
dimension1 =", df$number))

The SQL statement is just a string, so R never evaluates df$number inside it. To construct a string that contains the value, use:
sql.base<-"SELECT dimension1 FROM table1 WHERE dimension1 = "
sql.completed<-paste0(sql.base,df$number)
sqlQuery(connection, sql.completed)
If you're looking to run a parameterized SQL query, i.e. you have a number of rows in df, you can collapse the field values with a comma separator and put them inside parentheses within a SQL IN clause. Alternatively, you can use the RODBCext package, which lets you write it like this:
sqlExecute(connection, "SELECT dimension1 FROM table1 WHERE dimension1 = ?", df$number, fetch = TRUE)
For more info on RODBCext and this type of SQL, check out their vignette.
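The comma-collapse approach described above can be sketched in base R as follows. This is only an illustration: the data frame df and its values are made up, the table and column names follow the question, and the values are assumed to be numeric so that no quoting is needed.

```r
# Hypothetical data standing in for the asker's df$number
df <- data.frame(number = c(3, 7, 11))

# Collapse the distinct values into one comma-separated string
in_list <- paste(unique(df$number), collapse = ", ")

# Build a single query string with an IN clause
sql <- sprintf(
  "SELECT dimension1 FROM table1 WHERE dimension1 IN (%s)",
  in_list
)
print(sql)

# With an open RODBC connection the query would then be run as:
# sqlQuery(connection, sql)
```

Because the whole vector ends up in one string, the database receives one query rather than one query per row.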

Related

Passing a vector of characters into another string in R

I would like to know how to pass a vector of text into a string within R.
I have a list of emails stored as a character vector:
all.emails <-
list(
c('email_1#emailaddress_1.com',
'email_2#emailaddress_2.com',
'email_3#emailaddress_3.com',
'email_r#emailaddress_n.com'
)
)
Also within R, I have some SQL code stored as a string that I will pass to our database via a database connection in R. To do this, I created a string that is the query written in SQL but I want to pass the emails above into the string below so I can query the database only for those emails.
The SQL query will look something like this:
sql <-
"
1> SELECT column_1, column_2,..., column_n
2> FROM name.of.table
3> WHERE toaddress = '[this is where to pass the email list above into]'.
"
Line 3 is where I need to pass in my email list.
Any help will be appreciated.
You can create the sql statement as follows:
sql = paste0(
"SELECT column_1, column_2,..., column_n ",
"FROM name.of.table ",
"WHERE toaddress IN ('",
paste0(unlist(all.emails),collapse="','"),
"')"
)
Output:
"SELECT column_1, column_2,..., column_n FROM name.of.table WHERE toaddress IN ('email_1#emailaddress_1.com','email_2#emailaddress_2.com','email_3#emailaddress_3.com','email_r#emailaddress_n.com')"
If you are going to generate a lot of SQL, I recommend that you look into glue_sql(), which is designed specifically for this use case and has a lot of nice features:
library(glue)
## set up example db:
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "chickwts", chickwts)
tbl <- "chickwts"
feed <- list("soybean", "sunflower")
feed <- glue_sql_collapse(single_quote(feed), sep = ", ")
glue_sql("
SELECT * FROM {`tbl`}
WHERE feed IN ({feed})
", .con = con)
#> <SQL> SELECT * FROM `chickwts`
#> WHERE feed IN ('soybean', 'sunflower')
Created on 2022-10-21 with reprex v2.0.2
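The paste0 answer above works as long as no value contains a single quote. A minimal base R sketch of one way to guard against that, assuming the database follows the standard SQL convention of doubling single quotes inside a string literal; the helper name quote_sql_values and the example values are hypothetical:

```r
# Hypothetical helper: double any embedded single quotes, wrap each value
# in single quotes, and join everything with commas
quote_sql_values <- function(x) {
  escaped <- gsub("'", "''", x, fixed = TRUE)
  paste0("'", escaped, "'", collapse = ", ")
}

# Made-up values; the second one contains a single quote
emails <- c("a.b@example.com", "o'brien@example.com")

sql <- paste0(
  "SELECT column_1 FROM name.of.table WHERE toaddress IN (",
  quote_sql_values(emails),
  ")"
)
cat(sql, "\n")
```

If you already hold a DBI connection, DBI::dbQuoteString() is meant for this job and is probably the safer choice than a hand-written helper.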

Join a data frame (or other R object) to a table in a read-only Postgresql database?

I have read-only access to a Postgres database. I cannot write to the database.
Q. Is there a way to construct and run a SQL query where I join a data frame (or other R object) to a table in a read-only Postgres database?
This is for accessing data from WRDS, https://wrds-www.wharton.upenn.edu/
Here's an attempt at pseudocode
#establish a connection to a database
con <- dbConnect( Postgres(),
host = 'host.org',
port = 1234,
dbname = 'db_name',
sslmode = 'require',
user = 'username', password = 'password')
#create an R dataframe (or other object)
df <- data.frame( customer_id = c('a123', 'a-345', 'b0') )
#write a sql query we will run
sql_query <- "
SELECT t.customer_id, t.* FROM df t
LEFT JOIN table_name df
on t.customer_id = df.customer_id
"
res <- dbSendQuery(con, sql_query)
my_query_results <- dbFetch(res)
dbClearResult(res)
my_query_results
Note and edit: The example query I provided is intentionally super simple for example purposes.
In my actual queries, there might be 3 or more columns I want to join on, and millions of rows I want to join on.
Use the copy_inline function from the dbplyr package, which was added following an issue filed on this topic. See also the question here.
An example of its use is found here.
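For readers who want to see the general idea of putting the data into the query itself without write access, here is a hand-rolled base R sketch that builds a Postgres VALUES list and joins against it on two columns. It is not the dbplyr implementation: the region column, the helper sql_literal, and all values are made up for illustration.

```r
# Made-up lookup data with two join keys (the region column is hypothetical)
df <- data.frame(
  customer_id = c("a123", "a-345", "b0"),
  region = c("east", "west", "east"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn a character vector into SQL string literals,
# doubling any embedded single quotes
sql_literal <- function(x) paste0("'", gsub("'", "''", x, fixed = TRUE), "'")

# One "(value1, value2)" tuple per row of the data frame
rows <- paste0("(", sql_literal(df$customer_id), ", ", sql_literal(df$region), ")")

sql_query <- paste0(
  "SELECT t.*\n",
  "FROM table_name t\n",
  "JOIN (VALUES ", paste(rows, collapse = ", "), ") AS v(customer_id, region)\n",
  "  ON t.customer_id = v.customer_id AND t.region = v.region"
)
cat(sql_query, "\n")

# With an open connection the query would then be run as:
# my_query_results <- DBI::dbGetQuery(con, sql_query)
```

The query text grows with every row, so this is likely to be practical for thousands of rows rather than millions.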
If your join is on a single condition, it can be rewritten using an in clause:
In SQL:
SELECT customer_id
FROM table_name
WHERE customer_id in ('a123', 'a-345', 'b0')
Programmatically from R:
sql_query = sprintf(
"SELECT customer_id
FROM table_name
WHERE customer_id in (%s)",
paste(sQuote(df$customer_id, q = FALSE), collapse = ", ")
)

R, ClickHouse: Expected: FixedString(34). Got: UInt64: While processing

I am trying to query a subset of data from a ClickHouse database from R.
Here is the example:
library(data.table)
library(RClickhouse)
library(DBI)
subset <- paste(traffic[,unique(IDs)][1:30], collapse = ',')
conClickHouse <- DBI::dbConnect('here is the connection')
DataX <- dbGetQuery(conClickHouse, paste0("select * from database
where IDs in (", subset, ")"))
As a result I get this error:
DB::Exception: Type mismatch in IN or VALUES section. Expected: FixedString(34).
Got: UInt64: While processing (IDs IN ....
Any help is appreciated
Thanks to the comment of @DennyCrane,
"select * from database where toFixedString(IDs,34) in
(toFixedString(ID1, 34), toFixedString(ID2,34 ))"
This query subsets properly.
https://clickhouse.tech/docs/en/sql-reference/functions/#strong-typing
Strong Typing
In contrast to standard SQL, ClickHouse has strong typing. In other words, it doesn’t make implicit conversions between types. Each function works for a specific set of types. This means that sometimes you need to use type conversion functions.
https://clickhouse.tech/docs/en/sql-reference/functions/type-conversion-functions/#tofixedstrings-n
select * from (select 'x' B ) where B in (select toFixedString('x',1))
DB::Exception: Types of column 1 in section IN don't match: String on the left, FixedString(1) on the right.
Use a cast with toString or toFixedString:
select * from (select 'x' B ) where toFixedString(B,1) in (select toFixedString('x',1))
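The working query above lists placeholders (ID1, ID2). A base R sketch of how such a query string could be generated from a vector of IDs, assuming the IDs should be sent as string literals; the ID values are made up and the width 34 is taken from the error message:

```r
# Made-up IDs; in the question they come from traffic[, unique(IDs)]
ids <- c("ID0001", "ID0002", "ID0003")
fixed_len <- 34L  # width of the FixedString column, per the error message

# Wrap each ID as toFixedString('<id>', 34)
terms <- sprintf("toFixedString('%s', %d)", ids, fixed_len)

query <- paste0(
  "select * from database where toFixedString(IDs, ", fixed_len, ") in (",
  paste(terms, collapse = ", "),
  ")"
)
cat(query, "\n")

# With an open connection the query would then be run as:
# DataX <- DBI::dbGetQuery(conClickHouse, query)
```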

Bringing a R dataframe into a sql usable list with paste function

I have a data frame/list in R (Rdataframe) which I want to use directly in an RODBC query, something like:
Rdataframe= c('123456','234561','678912')
a= sqlQuery(connection, "Select * from table A where A.Id in Rdataframe")
And the query has to be like that, i.e. I can't pull the table into R first and then do the lookup.
So I think it can only run if it comes in a format such as
a= sqlQuery(connection, "Select * from table A where A.Id in ('123456','234561','678912')")
Despite several attempts with sprintf and paste, I have not been successful.
This is what I tried, without success:
attempt1= sqlQuery(connection, sprintf("Select * from table A where A.Id in %s", Rdataframe))
attempt2=paste(Rdataframe, sep=",")
And then using the attempt2 string in the query.
Any help is appreciated.
Rdataframe= c('123456' , '234561' , '678912')
df_str = paste(Rdataframe , collapse = "','" , sep=" ")
queryStr = paste("Select * from table A where A.Id in ('" ,df_str , "')" , sep="")
print(queryStr)
Gives the output
[1] "Select * from table A where A.Id in ('123456','234561','678912')"
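To see why the sprintf attempt in the question did not work, and how it can be made to work, here is a short base R sketch. It only builds strings; no connection is needed.

```r
Rdataframe <- c("123456", "234561", "678912")

# sprintf() is vectorised: a length-3 input yields three separate strings,
# one per ID, instead of one query
attempt <- sprintf("Select * from table A where A.Id in %s", Rdataframe)
length(attempt)

# Quote and collapse first, so sprintf() receives a single string
id_list <- paste0("'", Rdataframe, "'", collapse = ",")
queryStr <- sprintf("Select * from table A where A.Id in (%s)", id_list)
print(queryStr)
```

The same applies to paste(Rdataframe, sep = ","): sep only separates the arguments of paste, while collapse is what joins the elements of one vector.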

Using variable in "IN" function of SQL query in R

I have a variable x which contains 20000 IDs. I want to write a SQL query like:
select * from tablename where ID in x;
I am trying to implement this in R so that I get values only for the IDs in x. This is my attempt:
dbSendQuery(mydb, "select * from tablename where ID in ('$x') ")
I do not get any error when running this, but it returns 0 rows.
Next I tried:
sprintf("select * from tablename where ID in %s",x)
But this creates 20000 individual queries, which could prove costly for the database.
Can anybody suggest a way to write a command that retrieves the rows for all IDs in x and saves them to a data frame in R with a single query?
You need to have the codes in the actual string. Here is how I would do it with gsub:
x <- LETTERS[1:3]
sql <- "select * from tablename where ID in X_ID_CODES "
x_codes <- paste0("('", paste(x, collapse="','"), "')")
sql <- gsub("X_ID_CODES", x_codes, sql)
# see new output
cat(sql)
select * from tablename where ID in ('A','B','C')
# then submit the query
#dbSendQuery(mydb, sql)
How about pasting it (as written this works for numeric IDs; character IDs also need quotes around each value):
dbSendQuery(mydb, paste("select * from tablename where ID in (", paste(x, collapse = ","), ")"))
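With 20000 IDs a single IN list may run into database limits; Oracle, for example, rejects IN lists with more than 1000 literals. A base R sketch of a middle ground between one giant query and 20000 tiny ones, assuming character IDs; the IDs and the chunk size are made up:

```r
# Made-up IDs standing in for the 20000 real ones
x <- sprintf("ID%05d", 1:2500)
chunk_size <- 1000  # assumed limit; adjust to your database

# Split the IDs into groups of at most chunk_size elements
chunks <- split(x, ceiling(seq_along(x) / chunk_size))

# Build one query string per chunk, quoting each ID
queries <- vapply(chunks, function(ids) {
  paste0(
    "select * from tablename where ID in ('",
    paste(ids, collapse = "','"),
    "')"
  )
}, character(1))

length(queries)

# With an open DBI connection, run each query and stack the results:
# results <- lapply(queries, function(q) DBI::dbGetQuery(mydb, q))
# combined <- do.call(rbind, results)
```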
