I would like to run some queries with RStudio using the RODBC library. Normally, code like this works fine:
query_6 <- sqlQuery(con, "Select * from my_table where condition = more_than_sth")
I would prefer to define a variable beforehand that stands in for more_than_sth. Let's say it is x. Is there a way to put this variable into the query string? Should I build the string with paste beforehand, or can I put the variable in directly?
Regards,
Rafał
The concatenation function in R is paste. It automatically inserts a space between the objects it joins; you can avoid this by using paste(..., sep = "") or paste0().
more_than_sth <- "x"
query_6 <- sqlQuery(con, paste0("Select * from my_table where condition ='", more_than_sth, "'"))
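The difference between paste and paste0 matters here because a stray space inside the quotes changes the value being compared. The following is a minimal sketch that only builds and prints the strings; it does not touch a database, and the sprintf variant is an alternative added for illustration rather than something the answer above prescribes.

```r
more_than_sth <- "x"

# paste() joins its arguments with a single space by default.
with_spaces <- paste("condition = '", more_than_sth, "'")

# paste0() (equivalent to paste(..., sep = "")) joins them with nothing in between.
no_spaces <- paste0("condition = '", more_than_sth, "'")

print(with_spaces)  # "condition = ' x '"
print(no_spaces)    # "condition = 'x'"

# sprintf() builds the same string from a template, which can be easier to read.
query_text <- sprintf("Select * from my_table where condition = '%s'", more_than_sth)
print(query_text)

# With an open RODBC connection `con`, the call would then be:
# query_6 <- sqlQuery(con, query_text)
```

Printing the string before passing it to sqlQuery is a cheap way to confirm that the quotes and spaces ended up where you expect.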
I am working on a project where I have to download more than 10 million records on a relatively small server. So instead of just downloading the entire dataset, I have to download it in smaller sections. I am trying to create a loop that will call batches of the data based on date. I'm used to coding in Stata where you can call a local by using `x' or some variant within a string. However, I can't find a way to do this in R. Below is a small piece of the code I'm using. Basically, whenever I try to run this 'val' and 'val2' aren't updating with the dates in the defined lists so the output literally just reads as if the server is trying to search between 'val' and 'val2' instead of between '20190101' and '20190301'. Any suggestions for how to fix this are greatly appreciated!
x<-c(20190101, 20190301)
y<-c(20190301, 20190501)
foreach (val=x, val2=y) %do% {
data<-DBI::dbGetQuery(myconn, "SELECT * FROM .... WHERE (DATE BETWEEN 'val' AND 'val2')")
}
With a basic loop
x<-c(20190101, 20190301)
y<-c(20190301, 20190501)
data_all <- NULL
for (i in seq_along(x)) {
  query <- paste0("SELECT * FROM .... WHERE (DATE BETWEEN '",
                  x[i], "' AND '", y[i], "')")
  data <- DBI::dbGetQuery(myconn, query)
  data_all <- rbind(data_all, data)
}
With sprintf you can construct the query and use lapply + do.call to combine the results into one dataframe.
x<-c(20190101, 20190301)
y<-c(20190301, 20190501)
input <- sprintf("SELECT * FROM .... WHERE (DATE BETWEEN '%s' AND '%s')", x, y)
result <- do.call(rbind, lapply(input, function(x) DBI::dbGetQuery(myconn, x)))
Using purrr::map_df is a bit shorter.
result <- purrr::map_df(input, ~DBI::dbGetQuery(myconn, .x))
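To check the batching logic before pointing it at the real server, the same pattern can be run against a stand-in function. This is only a sketch: fake_query is a hypothetical replacement for DBI::dbGetQuery(myconn, q), and my_table is a placeholder table name, not the one elided in the question.

```r
x <- c(20190101, 20190301)
y <- c(20190301, 20190501)

# sprintf() is vectorised, so this yields one query string per date range.
# "my_table" is a placeholder name used only for this sketch.
queries <- sprintf("SELECT * FROM my_table WHERE (DATE BETWEEN '%s' AND '%s')", x, y)
print(queries)

# Hypothetical stand-in for DBI::dbGetQuery(myconn, q), so the sketch runs
# without a database. It returns a one-row data frame recording the query.
fake_query <- function(q) data.frame(query = q, stringsAsFactors = FALSE)

# Fetch each batch, then stack the batches into one data frame.
batches <- lapply(queries, fake_query)
result <- do.call(rbind, batches)
print(nrow(result))
```

Note that BETWEEN is inclusive at both ends in standard SQL, so with the ranges above the value 20190301 falls into both batches; starting each later batch one day after the previous end would avoid fetching those rows twice.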
I have the following string
wrows <- "c(1:2,4)"
(which is read from a database, therefore not just c(1:2,4)).
I want to use this in an expression in the function read.xlsx (openxlsx) like this:
read.xlsx(inp_file,
          sheetn,
          rows = wrows,
          cols = c(1,2))
How can I use wrows such that read.xlsx reads c(1:2,4) instead of "c(1:2,4)"?
Thanks
Renger
I am very new to R, so please forgive any obvious or naive errors. I need to insert multiple rows of data from R into an Oracle database table.
Make the data frame (I have made the RJDBC connection earlier in the script):
df <- data.frame("field_1" = 1:2, "field_2" = c("f","k"), "field_3"= c("j","t"))
This code runs without error, but inserts only the first row into the table:
insert <- sprintf("insert into temp_r_test_u_suck values (%s')",
apply(df, 1, function(i) gsub(" ", "", paste("'", i, collapse="',"), fixed = TRUE)))
dbSendUpdate(con, insert)
This code runs:
insert <- sprintf("into temp_r_test_u_suck values (%s')",
apply(df, 1, function(i) gsub(" ", "", paste("'", i, collapse="',"), fixed = TRUE)))
insert_all <- c("insert all", insert, "select * from dual")
dbSendUpdate(con, insert_all)
But gives me this error:
Error in .local(conn, statement, ...) :
execute JDBC update query failed in dbSendUpdate (ORA-00905: missing keyword
Both of the queries work on their own in Oracle. WHAT am I doing wrong?
Thank you!
Multiple SQL statements are not supported in a single dbGetQuery, dbSendQuery, or dbSendUpdate call, which is why only the first statement is processed. You need to issue one call per statement. To resolve this, extend the anonymous function inside apply so that it calls dbSendUpdate for each row:
apply(df, 1, function(i) {
# BUILD SQL STATEMENT
insert <- sprintf("insert into temp_r_test_u_suck values (%s')",
paste0("'", i, collapse="',"))
# RUN QUERY
dbSendUpdate(con, insert)
})
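Before sending anything to Oracle, it can help to build the statements first and look at them. The sketch below does only that, under the assumption that every column should be sent as a quoted string, as in the answer above; the trimws call is an addition, included because apply converts the data frame to a character matrix and that conversion can pad numbers with spaces.

```r
df <- data.frame(field_1 = 1:2, field_2 = c("f", "k"), field_3 = c("j", "t"),
                 stringsAsFactors = FALSE)

# Build one INSERT statement per row without executing it.
statements <- apply(df, 1, function(row) {
  # trimws() removes padding that the matrix conversion inside apply() may add.
  values <- paste0("'", trimws(row), "'", collapse = ", ")
  sprintf("insert into temp_r_test_u_suck values (%s)", values)
})
statements <- unname(statements)
print(statements)

# With an open RJDBC connection `con`, each statement would be sent separately:
# for (s in statements) dbSendUpdate(con, s)
```

Keeping the string building separate from the database calls makes it easy to spot a misplaced quote or parenthesis before the driver reports it as an ORA error.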
However, RJDBC extends the DBI standard by supporting parameterization in dbSendUpdate, which, as the RForge docs mention, allows bulk inserts without iteratively concatenating strings:
dbSendUpdate(conn, statement, ...) This function is analogous to
dbSendQuery, but works with DBML statements and thus doesn't return a
result set. It is more efficient than dbSendQuery. In addition, as of
RJDBC 0.2-9 it supports vectors in prepared statements which allows
bulk-inserts.
# ALL CHARACTER DATAFRAME
df <- data.frame(field_1=as.character(1:2), field_2=c("f","k"), field_3=c("j","t"),
stringsAsFactors=FALSE)
# PREPARED STATEMENT
sql <- "insert into temp_r_test_u_suck values (?, ?, ?)"
# RUN QUERY
dbSendUpdate(con, sql, df$field_1, df$field_2, df$field_3)
I'm trying to add data to a MySQL table using RMySQL. I only need to add one row at a time, and it's not working. What I'm trying to do is this:
dbGetQuery(con,"INSERT INTO names VALUES(data[1,1], data[1,2])")
I have values in a data frame named "data" that I need to put into the MySQL table. Before that, I check whether they are already in the table, and I add them only if they are not. But it isn't working this way. The data is read from a .csv file with read.csv.
You can use paste to construct the actual query. The variable name has to appear outside the quotes so that R substitutes its value.
data <- matrix(1:4, 2, 2)
query <- paste("INSERT INTO names VALUES(",data[1,1], ",", data[1,2], ")")
query
#[1] "INSERT INTO names VALUES( 1 , 3 )"
dbGetQuery(con, query)
# If there are a lot of columns this could be tedious...
# So we could also use paste to add all the values at once.
query <- paste("INSERT INTO names VALUES(", paste(data[1,], collapse = ", "), ")")
query
#[1] "INSERT INTO names VALUES( 1, 3 )"
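The matrix example above contains only numbers. Data read from a .csv file often includes text, which has to be wrapped in single quotes inside the SQL statement. The following is a rough sketch of one way to handle that; sql_literal is a hypothetical helper, the column names are made up, and doubling embedded single quotes is the common SQL escaping convention rather than anything specific to RMySQL.

```r
# Hypothetical helper: format a single value as a SQL literal.
sql_literal <- function(v) {
  if (is.numeric(v)) {
    # Numbers go into the statement unquoted.
    as.character(v)
  } else {
    # Text is quoted, with embedded single quotes doubled.
    paste0("'", gsub("'", "''", as.character(v), fixed = TRUE), "'")
  }
}

# Example row as it might come from read.csv(); the column names are made up.
data <- data.frame(id = 1L, name = "O'Brien", stringsAsFactors = FALSE)

# Format every column of the first row, then join the pieces.
values <- vapply(data[1, ], sql_literal, character(1))
query <- paste0("INSERT INTO names VALUES (", paste(values, collapse = ", "), ")")
print(query)

# With an open connection `con`:
# dbGetQuery(con, query)
```

For anything beyond a quick script, a parameterized query or dbWriteTable is usually safer than assembling literals by hand.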
You could try with:
dbWriteTable(con, "names", data[1, ], append = TRUE)
as the DBI package documentation details.
I'm in the process of learning R in order to wave SAS goodbye. I'm still new to this, and I sometimes have difficulty finding exactly what I'm looking for.
But for this specific case, I read:
Pass R variable to RODBC's sqlQuery?
and made it work for myself, as long as I'm only inserting one variable in the destination table.
Here is my code:
library(RODBC)
channel <- odbcConnect("test")
b <- sqlQuery(channel,
"select top 1 Noinscr
FROM table
where PrixVente > 100
order by datevente desc")
sqlQuery(channel,
paste("insert into TestTable (UniqueID) Values (",b,")", sep = ""))
When I replace the top 1 by any other number, let's say top 2, and run the exact same code, I get the following errors:
[1] "42000 195 [Microsoft][SQL Server Native Client 10.0][SQL Server]
'c' is not a recognized built-in function name."
[2] "[RODBC] ERROR: Could not SQLExecDirect
'insert into TestTable (UniqueID) Values (c(8535735, 8449336))'"
I understand that it is because there is an extra c that is generated, I assume for column when I give the command: paste(b).
So how can I get "8535735, 8449336" instead of "c(8535735, 8449336)" when using paste(b)? Or is there another way to do this?
Look into the collapse argument in the paste() documentation. Try replacing b with paste(b, collapse = ", "), as shown below.
Edit: As Joshua points out, sqlQuery returns a data.frame, not a vector. So, instead of paste(b, collapse = ", "), you could use paste(b[[1]], collapse = ", ").
library(RODBC)
channel <- odbcConnect("test")
b <- sqlQuery(channel,
"select top 1 Noinscr
FROM table
where PrixVente > 100
order by datevente desc")
sqlQuery(channel,
## note paste(b[[1]], collapse = ", ") in line below
paste("insert into TestTable (UniqueID) Values (", paste(b[[1]], collapse = ", "),")", sep = ""))
Assuming b looks like this:
b <- data.frame(Noinscr=c("8535735", "8449336"))
Then you only need a couple steps:
# in case Noinscr is a factor
b$Noinscr <- as.character(b$Noinscr)
# convert the vector into a single string
# NOTE that I subset to get the vector, since b is a data.frame
B <- paste(b$Noinscr, collapse=",")
# create your query
paste("insert into TestTable (UniqueID) Values (",B,")", sep="")
# [1] "insert into TestTable (UniqueID) Values (8535735,8449336)"
You got odd results because sqlQuery returns a data.frame, not a vector. As you learned, using paste on a data.frame (or any list) can provide weird results because paste must return a character vector.
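The behaviour described above can be reproduced without a database. In this sketch, b is a hand-made stand-in for what sqlQuery would return with "top 2". The last part goes one step beyond the answers: it assumes the goal is one inserted row per ID and uses a multi-row VALUES list, which many databases accept but which you should verify for your server version.

```r
# Stand-in for the data.frame that sqlQuery() would return with "top 2".
b <- data.frame(Noinscr = c(8535735, 8449336))

# paste() on a data.frame treats it as a list and deparses each column,
# which is where the stray "c(...)" comes from.
print(paste(b))  # "c(8535735, 8449336)"

# Extracting the column first gives the plain comma-separated values.
ids <- paste(b[[1]], collapse = ", ")
print(ids)       # "8535735, 8449336"

# Assumption: one row per ID is wanted, so each value gets its own parentheses.
rows <- paste0("(", b[[1]], ")", collapse = ", ")
query <- paste0("insert into TestTable (UniqueID) Values ", rows)
print(query)
```

A single pair of parentheses containing both values would describe one row with two columns, so the per-value form is the one that matches a single UniqueID column.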