R loop over a data frame to search in PostgreSQL

The PostgreSQL table structure looks like this:
if (!dbExistsTable(pg, c("store", "a"))) {
dbGetQuery(pg, "
CREATE TABLE store.a (
company_name text,
num text,
date date,
file text
)
")
}
and the R data frame is:
df <- data.frame(num=c('13','15', '100', '700'))
What I want is to look up every value in the df column num in PostgreSQL, something like:
file.list <- dbGetQuery(pg, "
SET work_mem='1GB';
SELECT *
FROM store.a
WHERE num = ?????
")
I don't know how to build the WHERE clause (or a loop) to get the desired results. Is it also possible to restrict the results to a specific date or year?
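One way to avoid a loop entirely is to put all the values into a single IN (...) list. The following is a minimal sketch in base R, assuming num is a text column (as in the table definition) and using a hypothetical year of 2015, which the question does not specify. The quoting here is hand-rolled for illustration; if your driver provides DBI's dbQuoteString or sqlInterpolate, those are usually the safer choice.

```r
df <- data.frame(num = c("13", "15", "100", "700"), stringsAsFactors = FALSE)

# Quote each value as a SQL string literal (num is a text column),
# doubling any embedded single quotes.
quoted <- paste0("'", gsub("'", "''", df$num, fixed = TRUE), "'")

# Hypothetical year to restrict to; the question does not name one.
year <- 2015L

# A half-open date range selects the whole year.
query <- sprintf(
  "SELECT * FROM store.a WHERE num IN (%s) AND date >= '%d-01-01' AND date < '%d-01-01'",
  paste(quoted, collapse = ", "), year, year + 1L
)

# file.list <- dbGetQuery(pg, query)   # pg is the connection from the question
```

The resulting string can be passed to dbGetQuery in place of the placeholder query above.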

Related

How to read data from a database by chunk in R?

In dplyr, if tbl is a table in a database then head(tbl) gets translated into
select
*
from
tbl
limit 6
but there doesn't seem to be a way to use the offset keyword to read data in chunks. E.g. the equivalent of
select
*
from
tbl
limit 6 offset 5
doesn't seem possible with dplyr. In dbplyr, there is a do function to let you choose a chunk_size to bring back data chunk-by-chunk.
Is that the only way to do it in R? The solution doesn't have to be in dplyr or the tidyverse.
Another approach would be to construct your own offset function. This assumes your database supports it, and the function is unlikely to be transferable to databases of other types.
Something like the following:
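library(dplyr)
library(dbplyr)

offset_head = function(table, num, offset){
  # get connection
  db_connection = table$src$con
  # as.integer() so the values are rendered as integer literals;
  # without an ORDER BY in 'table' the row order is not guaranteed
  sql_query = build_sql(con = db_connection,
                        sql_render(table),
                        "\nLIMIT ", as.integer(num),
                        "\nOFFSET ", as.integer(offset)
  )
  return(tbl(db_connection, sql(sql_query)))
}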
The way I have done this in dbplyr is based on the addition of a reference/ID column:
my_tbl = tbl(con, "table_name")
for(i in 0:99){  # ID %% 100 takes the values 0 to 99, never 100
sub_tbl = my_tbl %>% filter(ID %% 100 == i)
# further processing using 'sub_tbl'
...
}
If you add a row number to your dataset, then your filter could be replaced by filter(LowerBound <= row_number & row_number < UpperBound). Making one bound inclusive ensures that rows on a chunk boundary are not skipped.
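The chunk bounds for a row-number filter can be computed up front. This is a small base R sketch; n_rows and chunk_size are hypothetical values, and the row_number column is assumed to start at 1 and have no gaps.

```r
n_rows <- 23L       # hypothetical total number of rows
chunk_size <- 10L   # hypothetical chunk size

# Half-open intervals [lower, upper): every row falls in exactly one chunk.
lower <- seq(1L, n_rows, by = chunk_size)
upper <- pmin(lower + chunk_size, n_rows + 1L)

chunks <- data.frame(lower = lower, upper = upper)

# for (k in seq_len(nrow(chunks))) {
#   lo <- chunks$lower[k]; hi <- chunks$upper[k]
#   sub_tbl <- my_tbl %>% filter(lo <= row_number & row_number < hi)
# }
```

The commented loop shows where each pair of bounds would be used with the dbplyr table from the answer.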

Transform list inside a column to data.table columns in r

I have a data.table object where each row in the column "status" is filled with a list. How can I create a new data.table transforming its lists to columns:
Columns "as is":
status, done_ratio, created_on, updated_on, closed_on
Columns "to be":
status_id, status_name, done_ratio, created_on, updated_on, closed_on
Code used:
basic_data <- tickets[, .(status,done_ratio,created_on,updated_on,closed_on)]
Result of view(basic_data): http://diegoquirino.pro.br/printdetela/print_duvida_datatablelistcolumn.png
I'd like to have NOT a list, but ordinary columns: status_id, status_name
Thanks in advance for any help.
The following code does this (assuming tickets is the object shown in the picture):
# Collecting basic data
basic_data <- tickets[, .(done_ratio,created_on,updated_on,closed_on)]
status_data <- data.frame("status_id" = integer(),"status_name" = character())
# Fetching Status (each list)
for(status in tickets[,status]){
status <- as.data.frame(rbind(status))
names(status) <- c("status_id","status_name")
status_data <- rbind(status_data,status)
}
status_data <- cbind(status_data,basic_data)
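Growing a data frame with rbind inside a loop gets slow for many rows. As an alternative sketch, the two fields can be pulled out of each list in one pass with vapply. The sample data below is hypothetical (the real structure is only visible in the screenshot), and it assumes every status list holds an integer id first and a character name second; if the ids are stored as doubles, numeric(1) would be needed instead of integer(1).

```r
# Hypothetical stand-in for the 'tickets' object, using a base data frame.
tickets <- data.frame(done_ratio = c(0L, 100L), stringsAsFactors = FALSE)
tickets$status <- list(list(id = 1L, name = "New"),
                       list(id = 5L, name = "Closed"))

# Extract the first and second element of every list in the column.
status_id   <- vapply(tickets$status, function(s) s[[1]], integer(1))
status_name <- vapply(tickets$status, function(s) s[[2]], character(1))

# Combine the new ordinary columns with the remaining data.
basic_data <- data.frame(status_id, status_name, tickets["done_ratio"],
                         stringsAsFactors = FALSE)
```

The same vapply calls should also work on the status column of a data.table.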

Use values in df column to create a sql query

I would like to take the values from a data frame and paste them into a text string that can be used as a SQL query. In SAS I would do it like this:
proc sql noprint; Select Names into :names separated by ", " from df; quit;
This would create a macro variable &names storing all the names, like: Id, Name, Account. I would like to do the same kind of thing in R, but do not know how. I can create a vector with the names separated by commas, each surrounded by quotes, and I can remove the quotes with the noquote function, but I cannot get the elements into another paste statement that adds the SELECT and FROM; it will not all paste together. Is there a way to pull the values from a column and create a text string that can be used as a SQL query inside R? Here is what I have tried in R:
name = c("Id", "IsDeleted", "Name", "Credit__Loan__c")
label = c("Record Id", "Deleted", "ID", "Loan")
df = data.frame(name, label)
names(df) <- c("name", "label")
as.query.fields = noquote(paste(df$name, collaspe=", "))
as.query.final <- paste("SELECT " , noquote(paste(df$name, collaspe=", ")), " id FROM Credit_Amortization_Schedule__c")
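The collapse argument is misspelled as collaspe, so paste treats it as one more string to append instead of joining the names. Also, cat only prints and returns NULL, so use paste0 to keep the result:
data(iris)
colnames(iris)
a <- paste(colnames(iris), collapse=", ")
as.query.final <- paste0("SELECT ", a, ", id FROM Credit_Amortization_Schedule__c")
as.query.final
The result is:
[1] "SELECT Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species, id FROM Credit_Amortization_Schedule__c"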
which you can then use with SQL like this:
require(RODBC)
result <- sqlQuery(db, as.query.final)
where db is your database connection.
Or, since I see your sqldf tag now, if you want to use sqldf it's just:
sqldf(as.query.final)
The gsubfn package supports string interpolation:
library(gsubfn)
Names <- toString( sprintf("%s '%s'", df$name, df$label) )
fn$identity("select $Names from myTable")
giving:
[1] "select Id 'Record Id', IsDeleted 'Deleted', Name 'ID', Credit__Loan__c 'Loan' from myTable"
Here are some additional examples: SO example 1 and SO example 2.

Insert in RMySQL from data frame

I'm trying to add data to a MySQL table using RMySQL. I only need to add one row at a time, and it's not working. What I'm trying to do is this:
dbGetQuery(con,"INSERT INTO names VALUES(data[1,1], data[1,2])")
I have values in a data frame named data that I need to put into a MySQL table. Before inserting, I will check whether they are already in the table and add them only if they are not, but the statement above isn't working. The data is read from a .csv file with read.csv.
You can use paste to construct the actual query.
data <- matrix(1:4, 2, 2)
query <- paste("INSERT INTO names VALUES(",data[1,1], ",", data[1,2], ")")
query
#[1] "INSERT INTO names VALUES( 1 , 3 )"
dbGetQuery(con, query)
# If there are a lot of columns this could be tedious...
# So we could also use paste to add all the values at once.
query <- paste("INSERT INTO names VALUES(", paste(data[1,], collapse = ", "), ")")
query
#[1] "INSERT INTO names VALUES( 1, 3 )"
You could try with:
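dbWriteTable(con, "names", data[1,], append=TRUE, row.names=FALSE)
as the DBI package documentation details. Here data must be a data frame, as it is when read with read.csv.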
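The paste-based queries above work for numbers, but values read from a .csv file are often text and need to be quoted. The following is a rough sketch of how one row could be turned into a VALUES list; sql_literal is a hypothetical helper written for this example, not part of RMySQL or DBI, and the sample data is made up. Doubling single quotes is not a complete defence for MySQL (which can also treat backslashes as escapes), so DBI's dbQuoteString is the safer route where available.

```r
# Hypothetical helper: format a single value as a SQL literal.
sql_literal <- function(x) {
  if (is.na(x)) return("NULL")
  if (is.numeric(x)) return(as.character(x))
  paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
}

# Made-up sample data standing in for the data frame read from the .csv file.
data <- data.frame(id = 1:2, name = c("O'Brien", "Smith"),
                   stringsAsFactors = FALSE)

# Format every column of the first row, then join into one statement.
values <- vapply(data[1, ], sql_literal, character(1))
query <- paste0("INSERT INTO names VALUES (", paste(values, collapse = ", "), ")")
query
# [1] "INSERT INTO names VALUES (1, 'O''Brien')"

# dbGetQuery(con, query)   # con is the connection from the question
```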

Pass R variable to RODBC's sqlQuery with multiple entries?

I'm in the process of learning R to wave SAS goodbye. I'm still new to this and have difficulty finding exactly what I'm looking for.
But for this specific case, I read:
Pass R variable to RODBC's sqlQuery?
and made it work for myself, as long as I'm only inserting one variable in the destination table.
Here is my code:
library(RODBC)
channel <- odbcConnect("test")
b <- sqlQuery(channel,
"select top 1 Noinscr
FROM table
where PrixVente > 100
order by datevente desc")
sqlQuery(channel,
paste("insert into TestTable (UniqueID) Values (",b,")", sep = ""))
When I replace the top 1 by any other number, let's say top 2, and run the exact same code, I get the following errors:
[1] "42000 195 [Microsoft][SQL Server Native Client 10.0][SQL Server]
'c' is not a recognized built-in function name."
[2] "[RODBC] ERROR: Could not SQLExecDirect
'insert into TestTable (UniqueID) Values (c(8535735, 8449336))'"
I understand that it is because there is an extra c that is generated, I assume for column when I give the command: paste(b).
So how can I get "8535735, 8449336" instead of "c(8535735, 8449336)" when using paste(b)? Or is there another way to do this?
Look into the collapse argument in the paste() documentation. Try replacing b with paste(b, collapse = ", "), as shown below.
Edit: As Joshua points out, sqlQuery returns a data.frame, not a vector. So, instead of paste(b, collapse = ", "), you could use paste(b[[1]], collapse = ", ").
library(RODBC)
channel <- odbcConnect("test")
b <- sqlQuery(channel,
"select top 1 Noinscr
FROM table
where PrixVente > 100
order by datevente desc")
sqlQuery(channel,
## note paste(b[[1]], collapse = ", ") in line below
paste("insert into TestTable (UniqueID) Values (", paste(b[[1]], collapse = ", "),")", sep = ""))
Assuming b looks like this:
b <- data.frame(Noinscr=c("8535735", "8449336"))
Then you only need a couple steps:
# in case Noinscr is a factor
b$Noinscr <- as.character(b$Noinscr)
# convert the vector into a single string
# NOTE that I subset to get the vector, since b is a data.frame
B <- paste(b$Noinscr, collapse=",")
# create your query
paste("insert into TestTable (UniqueID) Values (",B,")", sep="")
# [1] "insert into TestTable (UniqueID) Values (8535735,8449336)"
You got odd results because sqlQuery returns a data.frame, not a vector. As you learned, using paste on a data.frame (or any list) can provide weird results because paste must return a character vector.
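One caveat: a statement of the form Values (8535735, 8449336) supplies two values for the single column UniqueID, which the server may reject. If the intent is to insert one row per ID (an assumption, since the question does not say), each value can be wrapped in its own parentheses. A minimal sketch, which relies on multi-row VALUES lists being supported (SQL Server 2008 and later):

```r
# Stand-in for the data.frame returned by sqlQuery.
b <- data.frame(Noinscr = c(8535735, 8449336))

# paste() on the data.frame itself deparses the column, hence the stray c(...)
paste(b)
# [1] "c(8535735, 8449336)"

# Wrap each ID in its own parentheses: one row per value.
values <- paste0("(", b[[1]], ")", collapse = ", ")
query <- paste0("insert into TestTable (UniqueID) Values ", values)
query
# [1] "insert into TestTable (UniqueID) Values (8535735), (8449336)"

# sqlQuery(channel, query)   # channel is the connection from the question
```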
