Import SQL table from active connection in R using odbc - r

I have connected to Microsoft SQL Server 15 on Windows with a DSN name, and I can see this connection in the Connections pane in RStudio, as the picture shows.
But I want to pull a specific table from this database, for example AdventureWorks2019 -> Sales -> CreditCard.
What command should I use to load this table as a data frame or tibble in R so I can work with it?
I can see what the table includes in the view script, as you can see.

You have to write a SQL query and execute it.
You can use the odbc::dbGetQuery() function.
Try this code:
my_dataframe <- odbc::dbGetQuery(conn = con, "SELECT TOP 1000 * FROM Sales.CreditCard")
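If you would rather end up with a tibble, or read the whole table without writing SQL, DBI offers a couple of alternatives (a sketch; con is the existing DSN connection from the question):

```r
library(DBI)

# Run the query, then convert the returned data frame to a tibble
cc <- dbGetQuery(con, "SELECT TOP 1000 * FROM Sales.CreditCard")
cc <- tibble::as_tibble(cc)

# Or read the entire table using a schema-qualified identifier
cc_all <- dbReadTable(con, Id(schema = "Sales", table = "CreditCard"))
```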

Related

Retrieve a specific DB project and its tables from SQL Server in R

I am new to using SQL Server from RStudio. I am connected to SQL Server from RStudio, and the server has several different projects listed in the image below. For this work I am using the odbc library. I am trying to retrieve the tables of a specific project (Project_3690). I have tried dbListTables(conn, "Project_3690"), but this command retrieves the tables from all of the projects listed in the picture below. I just want to retrieve the tables listed under dbo in Project_3690.
The first picture is from RStudio and the second is from SQL Server Management Studio, to show the structure of the folders in case it helps for writing a SQL query.
Thanks
Click the arrow to the left of the dbo object under Project_3690, and it should show you the tables you have access to. If it does not, then you have a permissions problem and will need to talk with the DBA. That lets you see them via the GUI. In fact, if you don't already know the names of the tables you should be accessing (such as to follow my code below), then this is the easiest route, as the pane already filters out the system and other tables that obscure what you need.
To see them in R code, dbListTables(conn) will show you all tables, including the ones in the Connections pane I just described, but also a lot of system and otherwise-internal tables that you don't need. On my SQL Server instance, it returns over 600 tables, so you may not want to do just that; instead, you can look for specific tables.
For example, if you know you should have tables Table_A and Table_B, then you could do
alltables <- dbListTables(conn)
grep("table_", alltables, value = TRUE, ignore.case = TRUE)
to see all of the table names containing that string.
If you do not see tables that you know you need to access, then it is likely that your connection code did not include the specific database, in which case you need something like:
conn <- dbConnect(odbc(), database = "Project_3690", uid = "...", pwd = "...",
                  server = "...", driver = "...")
(Most fields should already be in your connection code; don't use literal ... for your strings.)
One can use a system table to find the other tables:
DBI::dbGetQuery(conn, "select * from information_schema.tables where table_type = 'BASE TABLE' and table_schema = 'dbo'")
# TABLE_CATALOG TABLE_SCHEMA TABLE_NAME TABLE_TYPE
# 1 Project_3690 dbo Table_A BASE TABLE
# 2 Project_3690 dbo Table_B BASE TABLE
# 3 Project_3690 dbo Table_C BASE TABLE
(Notional output but representative of what it should look like.)
It's not quite straightforward to retrieve data from SQL Server using RStudio when you have different schemas all connected on the same server. It is easy to view the connected databases with their schemas in SQL Server Management Studio, but not in RStudio. The easiest way is to use the dot (.) operator: you can retrieve the tables of a specific database by fully qualifying the name as database.schema.table in dbGetQuery. I tried dbGetQuery(conn, "select * from Project_3690.dbo.AE_ISD") and it works perfectly fine.

How to use DBI::dbConnect() to read and write tables from multiple databases

I have a Netezza SQL server I connect to using DBI::dbConnect. The server has multiple databases we will name db1 and db2.
I would like to use dbplyr as much as possible and skip having to write SQL code in RODBC::sqlQuery(), but I am not sure how to do the following:
1) How do I read a table in db1, work on it, and have the server write the result into a table in db2 without going through my desktop?
2) How do I do a left join between a table in db1 and another in db2?
It looks like there might be a way to connect to database ="SYSTEM" instead of database = "db1" or "db2", but I am not sure what a next step would be.
con <- dbConnect(odbc::odbc(),
                 driver   = "NetezzaSQL",
                 database = "SYSTEM",
                 uid      = Sys.getenv("netezza_username"),
                 pwd      = Sys.getenv("netezza_password"),
                 server   = "NETEZZA_SERVER",
                 port     = 5480)
I work around this problem on SQL server using in_schema and dbExecute as follows. Assuming Netezza is not too different.
Part 1: shared connection
The first problem is to connect to both tables via the same connection. If we use a different connection then joining the two tables results in data being copied from one connection to the other which is very slow.
con <- dbConnect(...) # as required by your database
table_1 <- dplyr::tbl(con, from = dbplyr::in_schema("db1", "table_name_1"))
table_2 <- dplyr::tbl(con, from = dbplyr::in_schema("db2.schema2", "table_name_2"))
While in_schema is intended for passing schema names you can also use it for passing the database name (or both with a dot in between).
The following should now work without issue:
# check connection
head(table_1)
head(table_2)
# test join code
left_join(table_1, table_2, by = "id") %>% show_query()
# check left join
left_join(table_1, table_2, by = "id") %>% head()
Part 2: write to database
A remote table is defined by two things
The connection
The code of the current query (e.g. the result of show_query)
We can use these with dbExecute to write to the database. My example will be with SQL server (which uses INTO as the keyword, you'll have to adapt to your own environment if the sql syntax is different).
# optional, extract connection from table-to-save
con <- table_to_save$src$con
# SQL query
sql_query <- paste0("SELECT *\n",
                    "INTO db1.new_table \n", # the database and name you want to save
                    "FROM (\n",
                    dbplyr::sql_render(table_to_save),
                    "\n) subquery_alias")
# run query
dbExecute(con, as.character(sql_query))
The idea is to create a query that can be executed by the database that will write the new table. I have done this by treating the existing query as a subquery of the SELECT ... INTO ... FROM (...) subquery_alias pattern.
Notes:
If the SQL query produced by show_query or sql_render would work when you access the database directly, then the above should work (all that changes is that the command arrives via R instead of via the SQL console).
The functions I have written to smooth this process for me can be found here. They also include appending, deleting, compressing, indexing, and handling views.
Writing a table via dbExecute will error if the table already exists in the database, so I recommend checking for this first.
I use this work around in other places, but inserting the database name with in_schema has not worked for creating views. To create (or delete) a view I have to ensure the connection is to the database where I want the view.
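Depending on your dbplyr version, dplyr::compute() can also materialize the result server-side without hand-assembling the SQL (a sketch under that assumption; the table and database names are illustrative):

```r
library(dplyr)
library(dbplyr)

# Materialize the lazy query as a permanent table on the server.
# temporary = FALSE asks for a real table rather than a temp table;
# in_schema() qualifies where it should land.
result <- table_1 %>%
  left_join(table_2, by = "id") %>%
  compute(name = in_schema("db2", "new_table"), temporary = FALSE)
```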

Connecting to a database within a Redshift database via RStudio using `r-dbi` and `dbplyr` [duplicate]

My question is how can I use dplyr functions, such as tbl, on SQL Server tables that do not use the default "dbo" schema?
For more context, I am trying to apply the R database example given here to my own tables:
https://db.rstudio.com/ (scroll down to the section entitled "Quick Example").
It starts out ok. This first section runs fine:
install.packages("dplyr")
install.packages("odbc")
install.packages("dbplyr")
install.packages("DBI")
con <- DBI::dbConnect(odbc::odbc(),
                      Driver   = "SQL Server",
                      Server   = [My Server Name],
                      Database = "mydatabase",
                      UID      = [My User ID],
                      PWD      = [My Password],
                      Port     = 1433)
I am able to connect to my SQL Server and load in the tables in my database. I know this because
DBI::dbListTables(con)
returns the names of my available tables (but without any schema).
The next line of example code also works when applied to one of my own tables, returning the names of the columns in the table.
DBI::dbListFields(con, "mytable1")
However, once I try to run the next line:
dplyr::tbl(con, "mytable1")
I get an Invalid object name 'mytable1' error, rather than the expected table preview as in the example.
This error does not arise when I run the same code on a different table, mytable2. This time, as expected, I get a preview of mytable2 when I run:
dplyr::tbl(con, "mytable2")
One difference between mytable1 and mytable2 is the schema. mytable1 uses a made-up "abc" schema i.e. mydatabase.abc.mytable1. mytable2 uses the default "dbo" schema i.e. mydatabase.dbo.mytable2.
I tried dplyr::tbl(con, "abc.mytable1") but I get the same Invalid object name error. Likewise when I tried dplyr::tbl(con, "dbo.mytable2") (although it runs fine when I exclude the dbo part).
So how can I use dplyr functions, such as tbl, on SQL Server tables that do not use the default "dbo" schema? Thanks.
You can use dbplyr::in_schema.
In your case:
dplyr::tbl(con, dbplyr::in_schema("abc", "mytable1"))
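With recent versions of DBI and dbplyr, DBI::Id() is another way to qualify the schema (a sketch; the names are those from the question):

```r
library(DBI)
library(dplyr)

# Equivalent to dbplyr::in_schema("abc", "mytable1")
tbl(con, Id(schema = "abc", table = "mytable1"))
```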

Listing database tables for sqlite in R

I am unable to list tables for the sqlite database I am connecting to from R.
I setup the tables in my database (WBS_test1.db) using "DB Browser" https://sqlitebrowser.org/
Looking at this db in the command window, I am able to list the tables via .tables and view the data headers via .schema, so I know they are there (I can also preview them in DB Browser, of course).
In R however...,
I set my working directory, etc.
library(DBI)
library(RSQLite)
setwd(dir = "C:/here/there/folder")
sqlite <- dbDriver("SQLite")
I then connect to the database and attempt to list the tables and fields in one of the tables specifically
DBtest <- dbConnect(sqlite,"WBS_Test1.db")
dbListTables(DBtest)
dbListFields(DBtest, "WBS_CO2")
I get a "character(0)" returned, which from searching around looks like it's saying the tables are temporary.
I've also tried using the dplyr package
library(dplyr)
# connect to the sqlite file
test_db <- src_sqlite("C:/SQLite/WBS_test.db", create = TRUE)
src_tbls(test_db)
This again returns a "character(0)"
I have no prior experience with SQLite and modest experience in R, so I'm probably missing something simple, but I can't figure it out. Suggestions? Maybe I'm not pointing my working directory at the correct location for the RSQLite package?
Thanks!
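One common cause of character(0) here is that dbConnect() silently creates a brand-new, empty database file whenever the path it is given does not point at the existing file (note that the question connects to WBS_Test1.db in one snippet and WBS_test.db in the other). A quick sanity check, assuming the file really lives in the folder shown:

```r
library(DBI)
library(RSQLite)

db_path <- "C:/here/there/folder/WBS_Test1.db"

# If this is FALSE, dbConnect() will create a new empty database at
# that path, and dbListTables() will then return character(0).
file.exists(db_path)

# Connect with the full path rather than relying on the working directory
DBtest <- dbConnect(SQLite(), db_path)
dbListTables(DBtest)
```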

What is the best way to write an R data frame to a SQL Server database

I have an R data frame that I am trying to write to an existing table in SQL Server. The data frame contains only 8 of the roughly 12 columns contained in the table, and the columns in the data frame are not in the same order as the columns in the table. SQL Server is complaining because there are columns missing and other columns are of the wrong data type (e.g. varchar (string) vs. date, etc.).
I am looking at functions in the RODBC and DBI libraries to write the data frame to my SQL Server table, but it is clear that I have to line up the data frame columns in the order expected by the table and put null placeholders in for the missing columns.
What are my options?
Thank you ahead of time for any help you can provide.
So the obvious choice from the RODBC package would be sqlSave(connection, 'R data frame', 'SQL Table'); however, as you know, that doesn't work. In these cases I write an INSERT INTO statement using sqlQuery().
sqlQuery(connection, "INSERT INTO sqltable (B,C) VALUES ('i1',j1),('i2',j2)...")
Example: We have a SQL table named sqltable with columns A, B, and C. We have an R data frame named Rdf with columns B and C, where B is of class character and C is of class numeric.
First we need to put single quotes around any character fields, because we will be using INSERT INTO and SQL expects single quotes around any text
Rdf$B <- paste("'",Rdf$B,"'",sep="")
Next we need to format our data so it will look like the VALUES section of an insert statement
formatRdf <- do.call(paste,c(Rdf,sep=","))
valuesRdf <- paste("(",paste(formatRdf,collapse="),("),")",sep="")
Final step of preparing the INSERT statement
sql_statement <- paste("INSERT INTO sqltable (B,C) VALUES ",valuesRdf,sep="")
Now use sqlQuery
sqlQuery(connection,sql_statement)
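With a DBI/odbc connection, you can also avoid hand-building the INSERT string: align the data frame to the table's columns, fill the missing ones with NA, and append (a sketch; the table and column names follow the example above):

```r
library(DBI)

# Columns as defined in the target table, in order
target_cols <- dbListFields(con, "sqltable")

# Add NA placeholders for columns the data frame lacks,
# then reorder to match the table definition
missing <- setdiff(target_cols, names(Rdf))
Rdf[missing] <- NA
Rdf <- Rdf[target_cols]

# Append; DBI handles quoting and type conversion
dbAppendTable(con, "sqltable", Rdf)
```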
If you are looking for performance then rsqlserver will probably be the best choice:
A SQL Server database interface (DBI) driver for R. This is a DBI-compliant SQL Server driver based on the .NET Framework Data Provider for SQL Server (SqlClient), System.Data.SqlClient.
Motivation
The .NET Framework Data Provider for SQL Server (SqlClient) uses its own protocol to communicate with SQL Server. It is lightweight and performs well because it is optimized to access a SQL Server directly without adding an OLE DB or Open Database Connectivity (ODBC) layer.
In the wiki of the project you will find benchmarks for RODBC, RJDBC and rsqlserver.
Once you have a package to talk to the database, you follow the standard DBI examples, i.e. dbWriteTable or dbSendQuery for create/insert/update.
