RStudio - ODBC connection with SQL query taken from sql file

I have an ODBC query that connects to Teradata, and I wonder whether I can read the SQL from a file as opposed to having the SQL code inline. I am looking for the R equivalent of Python's pd.read_sql_query(f, con), where f is my SQL file with the code.
So for my connection, it would change from:
con <- function(){
query <- paste0("
SELECT * FROM table1
")
print(queryData(query))
}
con<- data.frame(con())
to
con <- function(){
query <- "SQL_code.sql"
print(queryData(query))
}
con<- data.frame(con())

Read your SQL from a file into a single string:
sql_query <- paste(readLines('/path/SQL_code.sql'), collapse = "\n")
then define the connection and use it:
library(DBI)
db <- dbConnect(...)
dbGetQuery(db, sql_query)

If I understand your question correctly, you could try something like this?
library(DBI)
library(readr)
df <- dbGetQuery(con, statement = read_file('SQL_code.sql'))
# con is your connection
If it does not solve your problem, there may be some solutions here: How to read the contents of an .sql file into an R script to run a query?
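For reference, here is a minimal self-contained sketch of the "read the SQL from a file, then run it" idea. It assumes the RSQLite package as a stand-in for the Teradata ODBC connection, and it creates a temporary .sql file and a toy table1 purely for illustration; with a real connection only the dbConnect() call and the file path should need to change.

```r
# Sketch only: RSQLite's in-memory database stands in for the real ODBC
# connection, and the .sql file is a temporary file made up for this example.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "table1", data.frame(id = 1:3, value = c("a", "b", "c")))

# Create a small query file, playing the role of SQL_code.sql
sql_file <- tempfile(fileext = ".sql")
writeLines(c("SELECT *", "FROM table1", "WHERE id > 1"), sql_file)

# Read the whole file into one string; keeping the line breaks means a
# "--" comment in the file cannot swallow the lines that follow it
sql_query <- paste(readLines(sql_file), collapse = "\n")

# Run the query and get a data frame back
df <- dbGetQuery(con, sql_query)
print(df)

dbDisconnect(con)
```

Note that dbGetQuery() expects a single statement, so a file holding several statements separated by semicolons would have to be split first.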

Related

Save dplyr query to different schema in dbplyr

I have a JDBC connection and would like to query data from one schema and save to another
library(tidyverse)
library(dbplyr)
library(rJava)
library(RJDBC)
# access the temp table in the native schema
temp <- tbl(conn, "temp")
temp_ed <- temp %>% mutate(new = 1)
# I would like to save temp_ed to a new schema "schema_new"
I would like to use something like dbplyr::compute() but define the output schema specifically. It seems dbplyr::copy_to could be used, but would require bringing the data through the local machine.
I want to use something like RJDBC::dbSendUpdate() but which would ideally integrate nicely with the data manipulating pipeline above.
I do this using dbExecute from the DBI package.
The key idea is to extract the query that defines the current remote table and make this a sub-query in a larger SQL query that writes the table. This requires that (1) the schema exists, (2) you have permission to write new tables, and (3) you know the correct SQL syntax.
Doing this directly might look like:
temp <- tbl(conn, "temp")
temp_ed <- temp %>% mutate(new = 1)
save_table_query = paste(
"SELECT * INTO my_database.schema_new.my_table FROM (\n",
dbplyr::sql_render(temp_ed),
"\n) AS sub_query"
)
dbExecute(conn, as.character(save_table_query))
INTO is the clause for writing a new table in SQL Server (the flavour of SQL I use). You will need to find the equivalent clause for your database.
In practice I use a custom function that looks something like this:
write_to_database <- function(input_tbl, db, schema, tbl_name){
# connection
tbl_connection <- input_tbl$src$con
# SQL query
sql_query <- glue::glue(
"SELECT *\n",
"INTO {db}.{schema}.{tbl_name}\n",
"FROM (\n",
dbplyr::sql_render(input_tbl),
"\n) AS sub_query"
)
result <- dbExecute(tbl_connection, as.character(sql_query))
}
Applying this in your context:
temp <- tbl(conn, "temp")
temp_ed <- temp %>% mutate(new = 1)
write_to_database(temp_ed, "my_database", "schema_new", "my_table")
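To try the sub-query idea without a SQL Server instance, here is a rough sketch under some loud assumptions: RSQLite stands in for the JDBC connection, an attached in-memory database named schema_new plays the role of the second schema, and the source table is called temp_src (a made-up name). SQLite has no SELECT ... INTO, so the sketch uses CREATE TABLE ... AS instead, which is the kind of dialect swap mentioned above.

```r
# Sketch only: SQLite stands in for the real database, and an attached
# in-memory database acts as the target schema "schema_new".
library(DBI)
library(dplyr)
library(dbplyr)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "ATTACH DATABASE ':memory:' AS schema_new")
dbWriteTable(con, "temp_src", data.frame(id = 1:3))  # hypothetical source table

# Build the lazy pipeline; nothing is pulled into R here
temp <- tbl(con, "temp_src")
temp_ed <- temp %>% mutate(new = 1)

# Wrap the rendered query as a sub-query of a statement that writes a table
save_table_query <- paste0(
  "CREATE TABLE schema_new.my_table AS\n",
  "SELECT * FROM (\n",
  as.character(sql_render(temp_ed)),
  "\n) AS sub_query"
)
dbExecute(con, save_table_query)

# Check the result in the target schema
result <- dbGetQuery(con, "SELECT * FROM schema_new.my_table")
print(result)

dbDisconnect(con)
```

The data never leaves the database in this flow; only the generated SQL text passes through R.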

Reading Data from a SQL server in RStudio (dplyr)

I am sure this question is very basic, but this is the first time I am using R connected to a server, so a few things still confuse me.
I used ODBC Data Sources on Windows to create a DSN, and used
con <- dbConnect(odbc::odbc(), "TEST_SERVER")
this worked, and now under the connection tab I can see the server, and if I double click I can see the databases and tables that exist in the server. How would I go about reading something inside one of those databases?
For Example, if the database name is db1, and the table name is t1, what is the code needed to read that table into local memory? I would prefer using dbplyr as I am familiar with the syntax. I am just unsure how to refer to a particular database and table after making the connection to the server.
I haven't used dbplyr before, but you can query the database using dbGetQuery.
test <- dbGetQuery(
con,
"SELECT *
FROM db1.t1
"
)
You can also pass the database into the connection string.
con <- dbConnect(
drv = odbc::odbc(),
dsn = "TEST_SERVER",
database = "db1"
)
And then your query would just be "SELECT * FROM t1".
EDIT: To query the table using dbplyr:
tbl1 <- tbl(con, "t1")
qry <- tbl1 %>% head() %>% collect()
I like to use RODBC:
con <- RODBC::odbcConnect(dsn = 'your_dsn',
uid = 'userid',
pwd = 'password')
table_output <- RODBC::sqlQuery(con, 'SELECT * FROM Table')

syntax for database.table in dbplyr?

I have a connection to our database:
con <- dbConnect(odbc::odbc(), "myHive")
I know this is successful because when I run it, in the top right of RStudio I can see all of our databases and tables.
My question is, how can I select a specific database and table combination? The documentation shows a user selecting a single table, "flights", but I need to do the equivalent of somedatabase.sometable.
Tried:
mytable <- tbl(con, "somedb.sometable")
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..dp_enterprise.uds_order
Then tried:
mytable <- tbl(con, "somedb::sometable")
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1344: 42S02: [Hortonworks][SQLEngine] (31740) Table or view not found: HIVE..somedb::sometable
I tried removing the quotes "" too.
Within the connections pane of RStudio I can see somedb.sometable. It's there! How can I save it to variable mytable?
You select the database when creating the connection and the table when creating the tbl (with the from argument).
There is no standard interface to dbConnect, so the exact way to pass the database name depends on the DBDriver you use. Indeed DBI::dbConnect is simply a generic dispatching to the driver-specific dbConnect.
In your case, the driver is odbc so you can check out the documentation for odbc::dbConnect and you'll see the relevant argument is database.
This will work:
con <- dbConnect(odbc::odbc(), "myHive", database = "somedb")
df <- tbl(con, from = "sometable")
With most other drivers (e.g. RMariaDB, RMySQL, RPostgres, RSQLite), the argument is called dbname, so you'd do this:
con <- dbConnect(RMariaDB::MariaDB(), dbname = "somedb")
df <- tbl(con, from = "sometable")
I think I found it, use in_schema
mytable <- tbl(con, in_schema("somedb", "sometable"))
This returns a list not a tbl though so I'm not sure.
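A small sketch of what in_schema() gives you, assuming RSQLite as a stand-in for the Hive ODBC connection and an attached in-memory database named somedb to mimic the database.table layout (the table contents are made up). The object returned by tbl() is stored as a list internally, but it carries the "tbl" class and behaves as a lazy remote table; collect() pulls it into a local data frame.

```r
# Sketch only: SQLite with an attached database mimics "somedb.sometable".
library(DBI)
library(dplyr)
library(dbplyr)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "ATTACH DATABASE ':memory:' AS somedb")
dbExecute(con, "CREATE TABLE somedb.sometable (id INTEGER, label TEXT)")
dbExecute(con, "INSERT INTO somedb.sometable VALUES (1, 'a'), (2, 'b')")

# Refer to the table as <schema>.<table>
mytable <- tbl(con, in_schema("somedb", "sometable"))

# It is a lazy table reference, not a plain list
is_tbl <- inherits(mytable, "tbl")
print(is_tbl)

# Bring the rows into local memory
local_df <- mytable %>% collect()
print(local_df)

dbDisconnect(con)
```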

How can I copy data from several tables and save to several CSV files?

I am trying to use R to copy all data from several tables in SQL Server, to several CSV files on my hard-drive. I guess it will look something like this...
library(RODBC)
library(XLConnect)
dbhandle <- odbcDriverConnect('driver={SQL Server};server=ServerName;database=DB_Name;trusted_connection=true')
# something here...
write.csv(currTableDF,file=paste("C:/Users/file_Name.csv",sep=""),row.names=FALSE)
This works for me...
install.packages("RODBC")
install.packages("dplyr")
library("RODBC")
conn <- odbcDriverConnect('driver={SQL Server};server=ServerName;database=DB_Name;trusted_connection=true')
data <- sqlQuery(conn, "SELECT * FROM Your_Table;")
write.csv(data,file=paste("C:/Users/file_Name.csv",sep=""),row.names=FALSE)
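To cover several tables rather than one, the same steps can go in a loop. The sketch below is an assumption-laden variant: it uses DBI with RSQLite instead of RODBC so that it runs anywhere, creates two toy tables (orders and customers, both hypothetical), and writes to a temporary directory rather than C:/Users. With RODBC, sqlFetch() would play the role of dbReadTable().

```r
# Sketch only: in-memory SQLite replaces the SQL Server connection.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "orders", data.frame(order_id = 1:2, amount = c(9.5, 20)))
dbWriteTable(con, "customers", data.frame(customer_id = 1:3))

out_dir <- tempdir()  # replace with your own output folder
csv_files <- character(0)

# One CSV per table, named after the table
for (tbl_name in dbListTables(con)) {
  curr_table_df <- dbReadTable(con, tbl_name)
  csv_path <- file.path(out_dir, paste0(tbl_name, ".csv"))
  write.csv(curr_table_df, file = csv_path, row.names = FALSE)
  csv_files <- c(csv_files, csv_path)
}

dbDisconnect(con)
print(csv_files)
```

If only some tables are wanted, looping over a hand-written character vector of table names instead of dbListTables(con) works the same way.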

How can I actually see the raw query generated by DBI::dbWriteTable?

I am wondering if there is a way to retrieve the SQL query which dbWriteTable sends to the DBMS. For example, for the following example. Is there a way to get the query?
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars[1:10, ])
dbDisconnect(con)
EDIT: 2016-11-11
As a Postgres user, I would be most interested in the commands sent using RPostgreSQL. Following @krlmlr, I figured out that there is a function postgresqlCopyInDataframe, which itself calls the C function RS_PostgreSQL_CopyInDataframe. So I hit a dead end here, as C is beyond my skills. Any ideas welcome...
The development version of RSQLite, which will soon hit CRAN, uses DBI::sqlCreateTable(). This function returns SQL that creates a particular table:
con <- dbConnect(RSQLite::SQLite(), ":memory:")
sqlCreateTable(con, "mtcars", mtcars[1:10, ])
dbDisconnect(con)
For other drivers you could look up the method definition in the driver's source code, e.g., via showMethods("dbWriteTable", includeDefs = TRUE) and proceed from there.
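As a rough illustration, DBI's SQL generators can also be called without any live database by passing DBI::ANSI() as a dummy connection, and sqlAppendTable() shows a matching INSERT statement. This is only the SQL that DBI's generic helpers produce; a given driver's dbWriteTable() may well do something different (prepared statements or a bulk copy, for instance), so treat it as an approximation of what is sent.

```r
# Sketch only: ANSI() is a dummy connection that generates ANSI-style SQL.
library(DBI)

# SQL that would create the table
create_sql <- sqlCreateTable(ANSI(), "mtcars", mtcars[1:10, ])
print(create_sql)

# SQL that would insert the first two rows
insert_sql <- sqlAppendTable(ANSI(), "mtcars", mtcars[1:2, ])
print(insert_sql)
```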
