How to do a batch insert using the R package odbc function dbWriteTable()

I have been using dbWriteTable() to save data frames from R into a SQL Server database for a while.
Normally, I do the following:
library(odbc)
con <- odbc::dbConnect(odbc(),
                       Driver = "SQL Server",
                       Server = server,
                       Database = "research",
                       Trusted_Connection = "True")
table <- Id(schema = "schemaName", table = "tableName")
savingResult <- dbWriteTable(con, table, dataframeToSave,
                             append = TRUE, overwrite = FALSE,
                             batch_rows = nrow(dataframeToSave))
It worked well until recently, when I used the same code to save data into a view in the database. The view has a trigger on it, so when you insert into the view, the trigger does some checks and saves the data into the corresponding tables. Basically, I can treat this view as a table.
When I did so, my data team told me I am not doing a batch insert: I am inserting the data row by row. So if I insert a data frame with 1000 rows, the trigger fires 1000 times, which makes the saving process very slow.
I thought dbWriteTable() always does a batch insert, and lots of posts on the internet also say that it does. My data team asked whether this function has a parameter similar to FIRE_TRIGGERS; if so, it might solve the issue. However, dbWriteTable() does not seem to have any such parameter. Can anyone confirm whether dbWriteTable() does a batch insert? If not, is there a way to do one?
For now I am working around the issue by using dbWriteTable() to write the data into a temp table first, and then inserting from the temp table. That way the trigger fires only once and the insert is very fast. But I would still like to know whether there is an easier way to do this without a temp table.
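A minimal, runnable sketch of that temp-table workaround, using an in-memory SQLite database as a stand-in for the SQL Server connection (the table and column names are made up; on SQL Server the staging table would be a session temp table such as #staging, and the single INSERT ... SELECT fires a statement-level trigger once instead of once per row):

```r
library(DBI)
library(RSQLite)  # in-memory stand-in for the odbc SQL Server connection

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE target (id INTEGER, val TEXT)")

dataframeToSave <- data.frame(id = 1:3, val = c("a", "b", "c"))

# 1. Bulk-load the data frame into a staging table.
dbWriteTable(con, "staging", dataframeToSave, overwrite = TRUE)

# 2. One set-based INSERT: a statement-level trigger fires once, not 1000 times.
n_inserted <- dbExecute(con, "INSERT INTO target SELECT id, val FROM staging")

dbRemoveTable(con, "staging")
n <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM target")$n
dbDisconnect(con)
```

The same two-step pattern (dbWriteTable into a temp table, then dbExecute of an INSERT ... SELECT) is what the workaround in the question amounts to.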
Thank you.

Related

How do you join a table in R with tables in SQL Server

I have a quick but tricky question.
What I'm trying to do:
I have an Excel spreadsheet that I upload into RStudio using read_excel().
I have 2 tables I need to reference from a SQL Server db.
I want to pull only the data from the SQL tables that match an ID from my R table.
My issues:
The tables I'm referencing from SQL are HUGE! I highly doubt I want to pull this data into R in order to use it, because even if R could do this, it would take forever.
I'm struggling with a solution to join cross-platform data.
Here is an example of some code that shows what I want to do, but I know doesn't work:
ExcelData <- read_excel("file_location.xlsx")
HospEnct <- sqlQuery(connClarity, "
  SELECT H.PAT_ID, H.PAT_ENC_CSN_ID, H.CONTACT_DATE
  FROM PAT_ENC_HSP AS H
  INNER JOIN ExcelData AS E ON E.CSN = H.PAT_ENC_CSN_ID
")
I know I could exclude the join and just pull that data into R and work with the data frame there, but the data I would pull would be too large. I thought I could filter the SQL data by date, which I think would work, but it would still take longer to load than I'd like since there is so much data. Is there a way to do this all in one script, or will I have to pull the data I need from SQL into R in order to work with the Excel data I pulled into R? Side note: I only have read access to the SQL Server, not write access, so I can't put the Excel data into SQL and work from there.
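One common workaround when you only have read access is to build the filter into the query text itself: collapse the local IDs into an IN (...) list and let the server do the filtering. A minimal sketch (the toy data frame stands in for the Excel file; connClarity, PAT_ENC_HSP, and the column names come from the question):

```r
# Toy stand-in for: ExcelData <- readxl::read_excel("file_location.xlsx")
ExcelData <- data.frame(CSN = c(1001, 1002, 1002, 1003))

# Collapse the unique local IDs into an IN (...) list. These CSNs are numeric;
# character IDs would need proper quoting (e.g. DBI::dbQuoteLiteral()).
csn_list <- paste(unique(ExcelData$CSN), collapse = ", ")

sql <- paste0(
  "SELECT H.PAT_ID, H.PAT_ENC_CSN_ID, H.CONTACT_DATE ",
  "FROM PAT_ENC_HSP AS H ",
  "WHERE H.PAT_ENC_CSN_ID IN (", csn_list, ")"
)

# HospEnct <- sqlQuery(connClarity, sql)  # filter runs server-side; only matches come back
```

This keeps the heavy lifting on the server. For ID lists beyond a few thousand values, the usual refinement is to split the list into chunks and run one query per chunk.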

RJDBC: R to Oracle cannot DELETE or DROP TABLE

I'm using RJDBC to connect to a local database. This allows me to make SELECT queries easily using dbGetQuery, and CREATE TABLE using dbWriteTable.
However, I cannot figure out a method to DROP TABLE or DELETE or SELECT INTO directly from my R console. These things work when I do it directly in SQL Developer, but not when I pass the query onto the database from R.
How do I perform database record manipulations which are not SELECT statements using R?
I'd try using a different function instead.
dbGetQuery is meant for querying and iterating over the DB rather than manipulating its records.
Similar questions were asked before; I couldn't find a nice R example, but if it helps, a nice Java example can be found here:
EDIT:
I found the function I was talking about! Took me a while; anyhow, sqlQuery allows you to run pretty much any query, that is, one that changes the DB records. Example I modified from this source:
res <- sqlQuery(con1, "DROP TABLE TESTDATA", errors = FALSE)
# res will now hold the result of the query:
# -1 means an error; otherwise the query was successful and res holds the number of rows affected.
if (res == -1) { # if something went wrong
  cat("An error has occurred.\n")
  msg <- odbcGetErrMsg(con1) # use your connection for this
  print(msg)
} else {
  cat("Table was deleted successfully.\n")
}
EDIT 2:
I got it confused with RODBC; however, there's no reason to worry, since I found the RJDBC alternative as well! It's called dbSendUpdate. Example:
# Assuming you have the connection saved as conn, these examples show how to use dbSendUpdate to create tables and insert values.
# You can use it with any non-SELECT query, that is, one that manipulates records (UPDATE, DELETE, INSERT, DROP, etc.).
# create a table with dbSendUpdate:
dbSendUpdate(conn, "CREATE TABLE foo(a INT, b VARCHAR(100))")
# insert a value, binding parameters to placeholders in the statement:
dbSendUpdate(conn, "INSERT INTO foo VALUES(?, ?)", 42, "bar")
# feel free to modify the query itself; these are just example values.
This is similar to another answered question here.
Basically, dbGetQuery(), as its name implies, is used to send queries and receive their results.
If you want to send a general statement to the db, like DROP TABLE etc., you can use:
dbSendUpdate(connection_object, "drop table table_name")

Dynamic SQL vs. Temporary Tables

I have a stored procedure in which temporary tables (ON COMMIT PRESERVE ROWS) are created dynamically and data is inserted. When I try to execute any other dynamic SQL statement, the data in the temporary tables is deleted. But I need the data for further processing.
Can anyone tell me why the data is being lost, and what the solution is?
Thank you.
Three possible reasons for this:
There is an explicit commit.
There is an implicit commit (a DDL statement, typically).
You are closing the session and starting a new one.
If you cannot avoid these then you'll have to create a permanent table.

Selecting a specific table in Microsoft Access 2007 with R code

I have an Access database with a couple of tables, and I want to work in just one of them. I am using the RODBC library. Let's say the table I want to work with is called dtsample, and my Access database is called database.accdb.
Here is my code:
library(RODBC)
dataconnect <- odbcConnectAccess2007("database.accdb")
data <- sqlQuery(dataconnect,"SELECT*dtsample columb1, columb2...)
but it does not work. How can I define the table in Access that I want to work with?
Your solution is not really one, because you just got around learning about SELECT:
data <- sqlQuery(dataconnect, "SELECT * FROM dtsample WHERE Columb1 = 'a' OR Columb1 = 'b'")
My suggestion, if you are not fluent in SQL: use the query designer in Access, and when it works, get the generated SQL code (View: SQL) and paste it into R.

Inserting a dataset into a database table

I have a table in a database on a server that is identical to a table on my local machine. I have a ClickOnce application that needs to download the version of the records on the server down to my local database.
At the moment I have a web service that pulls back the records on the server in batches, using ASP.NET DataSets as containers. How do I commit the whole DataSet to the table in my local database? The local table is empty.
Cheers in advance!
If you already have a DataSet, containing one or several DataTables, why don't you just use the SqlDataAdapter and call its ".Update()" method with your DataSet?
In the SqlDataAdapter, you can define an InsertCommand, an UpdateCommand, a DeleteCommand which will take care of the three basic insert/update/delete statements for your rows. All you need to do is define / write those three SQL Statements once, and the SqlDataAdapter will do the rest for you (looping through the rows, figuring out whether to insert, update or delete etc.).
If you want, you can even use your basic SELECT statement from the SelectCommand in your DataSet and use the SqlCommandBuilder to build the INSERT, UPDATE and DELETE statements based on your SELECT.
MSDN Library doc on SqlDataAdapter
SQL Data Adapter without SqlCommandBuilder
MSDN Library doc on SqlCommandBuilder
Marc
There are several options. Here are the first two that come to mind.
One would be to loop through the DataTable, build an SQL INSERT statement on each iteration, and then execute that statement against the local database.
Another would be to use SqlBulkCopy to insert the data.
