How can I actually see the raw query generated by DBI::dbWriteTable?

I am wondering if there is a way to retrieve the SQL query which dbWriteTable sends to the DBMS. For example, is there a way to get the query generated by the following call?
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars[1:10, ])
dbDisconnect(con)
EDIT: 2016-11-11
As a Postgres user, I would be most interested in the commands sent using RPostgreSQL. Following @krlmlr's hint, I figured out that there is a function postgresqlCopyInDataframe, which itself calls the C function RS_PostgreSQL_CopyInDataframe. So I hit a dead end here, as C is beyond my skills. Any ideas welcome...
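One workaround that sidesteps the R internals entirely, assuming you have superuser rights on the Postgres server, is to turn on statement logging for your session and read the commands back from the server log. A sketch (connection details are placeholders):
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = "mydb")        # placeholder connection details
dbGetQuery(con, "SET log_statement = 'all'")  # needs superuser; logs every command this session sends
dbWriteTable(con, "mtcars", mtcars[1:10, ])   # the statements it issues should now appear in the server log
dbDisconnect(con)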

The development version of RSQLite, which will soon hit CRAN, uses DBI::sqlCreateTable(). This function returns SQL that creates a particular table:
con <- dbConnect(RSQLite::SQLite(), ":memory:")
sqlCreateTable(con, "mtcars", mtcars[1:10, ])
dbDisconnect(con)
For other drivers you could look up the method definition in the driver's source code, e.g., via showMethods("dbWriteTable", includeDefs = TRUE) and proceed from there.
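Note that sqlCreateTable() only renders the CREATE TABLE part. DBI also provides sqlAppendTable(), which renders the INSERT statement for the data, so between the two you can see roughly what a default dbWriteTable() sends:
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
sqlCreateTable(con, "mtcars", mtcars[1:10, ])  # the CREATE TABLE statement
sqlAppendTable(con, "mtcars", mtcars[1:10, ])  # the INSERT for the rows
dbDisconnect(con)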

Related

RStudio - ODBC connection with SQL query taken from sql file

I have my ODBC query to connect to Teradata and just wonder if I can read in the SQL file as opposed to having the SQL code inline. I am trying to find an R function like Python's pd.read_sql_query(f, con), where f is my SQL file.
So for my connection, it would change from:
con <- function() {
  query <- paste0("
    SELECT * FROM table1
  ")
  print(queryData(query))
}
con <- data.frame(con())
to
con <- function() {
  query <- "SQL_code.sql"
  print(queryData(query))
}
con <- data.frame(con())
Read your SQL from a file (readLines() keeps each line; collapse them back into one string):
sql_query <- paste(readLines('/path/SQL_code.sql'), collapse = "\n")
then define the connection and use it:
library(DBI)
db <- dbConnect(...)
dbGetQuery(db, sql_query)
If I understand your question correctly, you could try something like this?
library(DBI)
library(readr)
df <- dbGetQuery(con, statement = read_file('SQL_code.sql'))
# con is your connection
If it does not solve your problem, there may be some solutions here: How to read the contents of an .sql file into an R script to run a query?
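If you do this often, a tiny wrapper keeps it tidy (a sketch; run_sql_file is a made-up name, and it assumes a single-statement .sql file):
library(DBI)
run_sql_file <- function(con, path) {
  query <- paste(readLines(path), collapse = "\n")
  dbGetQuery(con, query)
}
df <- run_sql_file(con, "SQL_code.sql")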

Reading Data from a SQL server in RStudio (dplyr)

I am sure this question is very basic, but this is the first time I am using R connected to a server, so a few things still confuse me.
I used ODBC Data Sources on Windows to create a DSN, and used
con <- dbConnect(odbc::odbc(), "TEST_SERVER")
this worked, and now under the connection tab I can see the server, and if I double click I can see the databases and tables that exist in the server. How would I go about reading something inside one of those databases?
For Example, if the database name is db1, and the table name is t1, what is the code needed to read that table into local memory? I would prefer using dbplyr as I am familiar with the syntax. I am just unsure how to refer to a particular database and table after making the connection to the server.
I haven't used dbplyr before, but you can query the database using dbGetQuery.
test <- dbGetQuery(
  con,
  "SELECT *
   FROM db1.t1"
)
You can also pass the database into the connection string.
con <- dbConnect(
  drv = odbc::odbc(),
  dsn = "TEST_SERVER",
  database = "db1"
)
And then your query would just be "SELECT * FROM t1".
EDIT: To query the table using dbplyr:
library(dplyr)
library(dbplyr)
tbl1 <- tbl(con, "t1")
qry <- tbl1 %>% head() %>% collect()
I like to use RODBC:
con <- RODBC::odbcConnect(dsn = 'your_dsn',
                          uid = 'userid',
                          pwd = 'password')
table_output <- RODBC::sqlQuery(con, 'SELECT * FROM Table')

How to import large Database Table into R

I'm trying to import a large database table into R to do some global analysis.
I connect to an Oracle DB with ROracle and use dbGetQuery.
I make the minimum selection and put the necessary WHERE clauses directly in the query to reduce the scope of the dataset, but it is still 40 columns for 12 million rows.
My PC has only 8 GB of RAM; how can I handle this?
Is there a way to keep those data on disk rather than in RAM, or something similar to that?
The same thing done in SAS works fine.
Any ideas?
A few ideas:
Maybe some aggregation could be done on the server side?
You are going to do something with this data in R, right? So you can try not to load the data, but to create a tbl object and do the manipulations and aggregations lazily, so the work happens in the database:
library(dplyr)
library(dbplyr)
my_tbl <- 'SELECT ... FROM ...' %>% sql() %>% tbl(con, .)
where con is your connection
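For example (a sketch; the grouping and value columns are made up), the aggregation runs in the database and only the summary rows come back:
summary_df <- my_tbl %>%
  group_by(SOME_GROUP_COLUMN) %>%                                # hypothetical column
  summarise(n = n(), total = sum(SOME_VALUE, na.rm = TRUE)) %>%  # hypothetical column
  collect()                                                      # only aggregated rows reach RAM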
Here are a couple of ideas for you to consider.
library(RODBC)
dbconnection <- odbcDriverConnect("Driver=ODBC Driver 11 for SQL Server;Server=Server_Name;Database=DB_Name;Uid=;Pwd=;trusted_connection=yes")
initdata <- sqlQuery(dbconnection, "select * from MyTable where Name = 'Asher';")
odbcClose(dbconnection)
If you can export the table as a CSV file...
require(sqldf)
df <- read.csv.sql("C:\\your_path\\CSV1.csv", "select * from file where Name='Asher'")
df
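Another option, if you truly need all 12 million rows, is to stream the result in chunks and spill it to disk instead of holding it in RAM. A sketch using DBI's chunked fetch (con is your ROracle connection, and the query is your reduced one):
library(DBI)
res <- dbSendQuery(con, "SELECT ... FROM ...")
first <- TRUE
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 100000)  # 100k rows at a time
  write.table(chunk, "big_table.csv", sep = ",", row.names = FALSE,
              col.names = first, append = !first)
  first <- FALSE
}
dbClearResult(res)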

syntax for database.table in dbplyr?

I have a connection to our database:
con <- dbConnect(odbc::odbc(), "myHive")
I know this is successful because when I run it, in the top right of RStudio I can see all of our databases and tables.
My question is, how can I select a specific database.table combination? The documentation shows a user selecting a single table, "flights", but I need to do the equivalent of somedatabase.sometable.
Tried:
mytable <- tbl(con, "somedb.sometable")
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..dp_enterprise.uds_order
Then tried:
mytable <- tbl(con, "somedb::sometable")
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..somedb::sometable
I tried removing the quotes "" too.
Within the connections pane of RStudio I can see somedb.sometable. It's there! How can I save it to variable mytable?
You select the database when creating the connection and the table when creating the tbl (with the from argument).
There is no standard interface to dbConnect, so the exact way to pass the database name depends on the DBDriver you use. Indeed DBI::dbConnect is simply a generic dispatching to the driver-specific dbConnect.
In your case, the driver is odbc so you can check out the documentation for odbc::dbConnect and you'll see the relevant argument is database.
This will work:
con <- dbConnect(odbc::odbc(), "myHive", database = "somedb")
df <- tbl(con, from = "sometable")
With most other drivers (e.g. RMariaDB, RMySQL, RPostgres, RSQLite), the argument is called dbname, so you'd do this:
con <- dbConnect(RMariaDB::MariaDB(), dbname = "somedb")
df <- tbl(con, from = "sometable")
I think I found it: use in_schema() (from dbplyr):
mytable <- tbl(con, in_schema("somedb", "sometable"))
This returns a list, not a tbl, though, so I'm not sure.
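For what it's worth, a lazy tbl is a list under the hood, which may be what you're seeing; it is still usable as a remote table. A quick check, assuming the connection above:
library(dplyr)
library(dbplyr)
mytable <- tbl(con, in_schema("somedb", "sometable"))
class(mytable)                    # includes "tbl_sql", "tbl_lazy", "tbl"
mytable %>% head() %>% collect()  # pull the first rows into local memory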

R PostgreSQL - data is not getting updated in the table of the database

Hi, I am trying to update a PostgreSQL table using the RPostgreSQL package. The commands in R execute successfully, but the new data is not reflected in the database. Below are the commands I have executed in R:
for (i in new_data$FIPS) {
  drv <- dbDriver("PostgreSQL")
  con <- dbConnect(drv, dbname="ip_platform", host="******", port="5432", user="data_loader", password="******")
  txt <- paste("UPDATE adminvectors.county SET attributes = hstore('usco#TP-TotPop#2010'::TEXT,", new_data$TP.TotPop[new_data$FIPS == i], "::TEXT) where geoid = '", i, "'")
  dbGetQuery(con, txt)
  dbCommit(con)
  dbDisconnect(con)
}
Can anyone let me know if I have done something wrong? Any help is highly appreciated.
Simplify, simplify, simplify -- the RPostgreSQL package has had unit tests for these types of operations since the very beginning (2008, no less), and this works (unless you have database setup issues).
See eg here in the GitHub repo for all the tests.
You are calling dbGetQuery instead of dbSendQuery and also disconnecting from the database in your for loop. You also are creating a new connection object for every loop iteration, which is not necessary. Try this:
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="ip_platform", host="******", port="5432", user="data_loader", password="******")
for (i in new_data$FIPS) {
  txt <- paste("UPDATE adminvectors.county SET attributes = hstore('usco#TP-TotPop#2010'::TEXT,", new_data$TP.TotPop[new_data$FIPS == i], "::TEXT) where geoid = '", i, "'")
  dbSendQuery(con, txt)
}
dbDisconnect(con)
dbDisconnect(con)
You shouldn't call dbCommit(con) explicitly. The enclosing transaction will always be committed when dbSendQuery returns, exactly as when you run an UPDATE in plain SQL: you don't issue COMMIT unless you have opened a transaction with BEGIN TRANSACTION.
The warning "there is no transaction in progress" is PostgreSQL's way of telling you that you issued a COMMIT statement without a preceding BEGIN TRANSACTION statement, which is exactly what your loop is doing.
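If you ever do want an explicit transaction, the DBI pattern is dbBegin()/dbCommit(), assuming your driver implements these generics:
dbBegin(con)           # open the transaction
dbSendQuery(con, txt)  # one or more statements
dbCommit(con)          # or dbRollback(con) if something went wrong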
Thanks for all your inputs.
The issue is with the paste() function I used in the for loop: its default separator inserted a space between the pasted pieces, so the geoid value in the WHERE clause ended up padded with spaces and the condition matched nothing. I added sep = "" to the paste() call, and the query is now sent to the database correctly and the rows are updated as expected.
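A tiny illustration of the difference:
paste("geoid = '", "12345", "'")            # "geoid = ' 12345 '" -- padded, matches nothing
paste("geoid = '", "12345", "'", sep = "")  # "geoid = '12345'"
paste0("geoid = '", "12345", "'")           # same as sep = ""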
