Writing a Teradata WITH Statement in RODBC in R

I have connected Teradata to my R session with RODBC.
Typically I use data <- sqlQuery(conn, "SELECT statement"); however, when I put the following WITH statement in place of the SELECT statement, I get an error:
data <- sqlQuery(conn,
"WITH zzz as (SELECT statement1),
yyy as (SELECT statement2)
SELECT statement3"

Try correcting the mismatched " and ), as below:
data <- sqlQuery(conn,
"WITH zzz as (SELECT statement1),
yyy as (SELECT statement2)
SELECT statement3")
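For reference, a sketch of what a complete WITH query might look like through RODBC; the DSN, table, and column names below are made up for illustration:
library(RODBC)

conn <- odbcConnect("teradata_dsn", uid = "user", pwd = "password")  # hypothetical DSN

# Both CTEs and the final SELECT must sit inside the same quoted string
# passed to sqlQuery().
data <- sqlQuery(conn, "
  WITH zzz AS (SELECT id, amount FROM sales_2023),
       yyy AS (SELECT id, region FROM customers)
  SELECT z.id, z.amount, y.region
  FROM zzz z
  JOIN yyy y
    ON z.id = y.id
")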

Related

Join a data frame (or other R object) to a table in a read-only Postgresql database?

I have read-only access to a Postgres database. I cannot write to the database.
Q. Is there a way to construct and run a SQL query where I join a data frame (or other R object) to a table in a read-only Postgres database?
This is for accessing data from WRDS, https://wrds-www.wharton.upenn.edu/
Here's an attempt at pseudocode
library(DBI)
library(RPostgres)

#establish a connection to a database
con <- dbConnect( Postgres(),
host = 'host.org',
port = 1234,
dbname = 'db_name',
sslmode = 'require',
user = 'username', password = 'password')
#create an R dataframe (or other object)
df <- data.frame( customer_id = c('a123', 'a-345', 'b0') )
#write a sql query we will run
sql_query <- "
SELECT t.customer_id, t.* FROM df t
LEFT JOIN table_name x
on t.customer_id = x.customer_id
"
res <- dbSendQuery(con, sql_query)
my_query_results <- dbFetch(res)
dbClearResult(res)
my_query_results
Note and edit: the example query above is intentionally simple. In my actual queries, I might need to join on three or more columns, and on millions of rows.
Use the copy_inline function from the dbplyr package, which was added following an issue filed on this topic. See also the question here.
An example of its use is found here.
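A minimal sketch of how copy_inline() might be used here, reusing the con, df, and table_name from the question (the join column is assumed to be customer_id):
library(DBI)
library(dplyr)
library(dbplyr)

# copy_inline() embeds the local data frame in the generated SQL as an inline
# VALUES table, so nothing is written to the read-only database.
local_tbl  <- copy_inline(con, df)
remote_tbl <- tbl(con, "table_name")

my_query_results <- local_tbl %>%
  left_join(remote_tbl, by = "customer_id") %>%
  collect()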
If your join is on a single condition, it can be rewritten using an IN clause:
In SQL:
SELECT customer_id
FROM table_name
WHERE customer_id in ('a123', 'a-345', 'b0')
Programmatically from R:
sql_query = sprintf(
"SELECT customer_id
FROM table_name
WHERE customer_id in (%s)",
paste(sQuote(df$customer_id, q = FALSE), collapse = ", ")
)
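The constructed string can then be run with the connection from above, for example:
my_query_results <- dbGetQuery(con, sql_query)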

SQLite: Troubleshooting a missing table in R

I'm looking to inner join the results of one query with another using the RSQLite package in R. So far I have the following code:
library(RSQLite)
sql<-
"SELECT
telecast.telecast_id,
telecast.series_name,
affiliates.network_name
FROM telecast a
JOIN affiliates b
ON a.network_id = b.network_id limit 10;"
res <- dbSendQuery(con, sql)
df_ti <- dbFetch(res)
print(df_ti)
When I run the code, it says no table is found, but when I run just
SELECT telecast.telecast_id FROM telecast
the information shows up. I am not great at troubleshooting SQLite in R; any thoughts on how to fix this?
You create aliases for your tables, but then reference the original table names in the SELECT list. If you change your select query to use the aliases, it works:
library(RSQLite)
sql<-
"SELECT
a.telecast_id,
a.series_name,
b.network_name
FROM telecast a
JOIN affiliates b
ON a.network_id = b.network_id limit 10;"
res <- dbSendQuery(con, sql)
df_ti <- dbFetch(res)
print(df_ti)
I was able to figure it out. Your suggestion of creating aliases for the tables was spot on; I just gave each table an alias using AS. The code I ended up with is below. Thanks for the help!
SELECT
    t.series_name,
    ti.num_views_per_telecast
FROM
    (SELECT
         ti.telecast_id,
         ti.network_id,
         COUNT(*) AS num_views_per_telecast
     FROM
         tunein AS ti
     INNER JOIN
         affiliates AS a
     ON
         ti.network_id = a.network_id
     WHERE
         ti.dvr_time_shift = 'L+SD' AND
         a.network_name = 'ABC'
     GROUP BY
         ti.telecast_id,
         ti.network_id
    ) ti
INNER JOIN telecast AS t
ON
    t.telecast_id = ti.telecast_id
ORDER BY ti.num_views_per_telecast DESC

R - RMySQL - could not run statement: memory exhausted

I have an R script for data analysis. I run it on six different tables from my MySQL database; on five of them the script works fine, but on the last table it does not. Here is the relevant part of my code:
sql <- ""
#build one UNION of single-row SELECTs so the database is queried only once
for (i in 2:length(awq)-1){
  num <- awq[i]-1
  sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", num, ",1) UNION ")
}
sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", awq[length(awq)-1], ",1)")
#database query
nb <- dbGetQuery(mydb, sql)
The MySQL table where the script fails has 21,676 rows; my other tables have under 20,000 rows, and the script works with them. When it fails, it gives me this error:
Error in .local(conn, statement, ...) :
could not run statement: memory exhausted near '1) UNION (SELECT * FROM mytable LIMIT 14107 ,1) UNION (SELECT * FROM mytabl' at line 1
I understand there is a memory problem, but how do I solve it? I don't want to delete rows from my table. Is there another way?
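As a rough sketch of one possible workaround (assuming mydb and awq as above, with offsets roughly as in the loop): build the UNION in smaller batches so each statement stays small, and bind the pieces together in R.
# Sketch: split the row offsets into batches so each UNION statement stays small.
offsets <- awq - 1
batches <- split(offsets, ceiling(seq_along(offsets) / 500))

nb <- do.call(rbind, lapply(batches, function(off) {
  parts <- paste0("(SELECT * FROM mytable LIMIT ", off, ",1)")
  dbGetQuery(mydb, paste(parts, collapse = " UNION "))
}))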

Bringing an R dataframe into a SQL-usable list with the paste function

I have a dataframe/list in R (Rdataframe) which I want to use directly in an RODBC query, something like:
Rdataframe= c('123456','234561','678912')
a= sqlQuery(connection, "Select * from table A where A.Id in Rdataframe")
And the query has to be like that, i.e. I can't pull the table into R first and then do the lookup.
So I think it can only run if it comes in a format such as
a= sqlQuery(connection, "Select * from table A where A.Id in ('123456','234561','678912')")
But despite several attempts with sprintf and paste, I am still not successful.
This is what I tried, without success:
attempt1= sqlQuery(connection, sprintf("Select * from table A where A.Id in %s", Rdataframe))
attempt2=paste(Rdataframe, sep=",")
And then I used this attempt 2 string in the query.
Any help is appreciated.
Rdataframe= c('123456' , '234561' , '678912')
df_str = paste(Rdataframe , collapse = "','" , sep=" ")
queryStr = paste("Select * from table A where A.Id in ('" ,df_str , "')" , sep="")
print(queryStr)
Gives the output
[1] "Select * from table A where A.Id in ('123456','234561','678912')"

Using a variable in the "IN" clause of a SQL query in R

I have a variable x which contains 20,000 IDs. I want to write a SQL query like:
select * from tablename where ID in x;
I am trying to implement this in R so that I get values only for the IDs in x. The following is my attempt:
dbSendQuery(mydb, "select * from tablename where ID in ('$x') ")
I am not getting any error when trying this, but it returns 0 rows.
Next I tried using
sprintf("select * from tablename where ID in %s",x)
But this creates 20,000 individual queries, which could prove costly on the database.
Can anybody suggest a way to pass all the IDs in x in a single query and save the result to a data frame in R?
You need to put the ID codes into the actual query string. Here is how I would do it with gsub:
x <- LETTERS[1:3]
sql <- "select * from tablename where ID in X_ID_CODES "
x_codes <- paste0("('", paste(x, collapse="','"), "')")
sql <- gsub("X_ID_CODES", x_codes, sql)
# see new output
cat(sql)
select * from tablename where ID in ('A','B','C')
# then submit the query
#dbSendQuery(mydb, sql)
How about pasting it:
dbSendQuery(mydb, paste("select * from tablename where ID in (", paste(x, collapse = ","), ")"))
Note that with character IDs the values would still need quoting, as in the gsub example above.
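An alternative sketch using DBI parameter binding, which avoids pasting values into the SQL text at all; this assumes a backend that supports placeholders (for example RMariaDB or RSQLite), and note that drivers cap the number of parameters, so a very large x may need to be split into batches:
library(DBI)

# One '?' placeholder per ID; the values are bound separately from the SQL text.
placeholders <- paste(rep("?", length(x)), collapse = ", ")
sql <- sprintf("select * from tablename where ID in (%s)", placeholders)

res <- dbSendQuery(mydb, sql)
dbBind(res, as.list(x))   # bind each element of x to its placeholder
out <- dbFetch(res)
dbClearResult(res)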
