I am unable to list the tables of the SQLite database I am connecting to from R.
I set up the tables in my database (WBS_test1.db) using "DB Browser" (https://sqlitebrowser.org/).
Looking at this db in the command-line shell, I can list the tables via .tables and view their definitions via .schema, so I know they are there (I can also preview them in DB Browser, of course).
In R, however, I load the packages and set my working directory:
library(DBI)
library(RSQLite)
setwd(dir = "C:/here/there/folder")
sqlite <- dbDriver("SQLite")
I then connect to the database and attempt to list the tables, and the fields of one table specifically:
DBtest <- dbConnect(sqlite, "WBS_Test1.db")  # note: this creates an empty WBS_Test1.db here if no such file exists
dbListTables(DBtest)
dbListFields(DBtest, "WBS_CO2")
I get character(0) returned, which from searching around seems to suggest the tables are temporary.
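From reading around, one possibility is that dbConnect() silently created a brand-new empty database because the file name or working directory is off, which would also explain the empty result. A quick way to check:
file.exists("WBS_Test1.db")  # does the file R is opening already exist in this directory?
file.size("WBS_Test1.db")    # a size of 0 suggests dbConnect() just created an empty file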
I've also tried using the dplyr package
library(dplyr)
# connect to the sqlite file
# (note: create = TRUE makes src_sqlite create a new, empty database if the file does not exist)
test_db <- src_sqlite("C:/SQLite/WBS_test.db", create = TRUE)
src_tbls(test_db)
This again returns character(0).
I have no prior experience with SQLite and modest experience in R, so I'm probably missing something simple, but I can't figure it out. Suggestions? Maybe I'm not pointing my working directory at the correct location for the RSQLite package?
Thanks!
I am new to using SQL Server from RStudio. I am connected to SQL Server from RStudio, and the server has several different projects listed, as shown in the image below. For this work I am using the odbc library. I am trying to retrieve the tables of a specific project (Project_3690). I have tried dbListTables(conn, "Project_3690"), but this command retrieves the tables from all the projects shown. I just want to retrieve the tables listed under dbo in Project_3690.
The first picture is from RStudio and the second is from SQL Server Management Studio, showing the folder structure in case a SQL query is needed.
Thanks
Click on the arrow to the left of the dbo object under Project_3690, and it should show you the tables you have access to. If it does not, then you have a permissions problem and will need to talk with the DBA. That lets you see them via the GUI. In fact, if you don't already know the names of the tables you should be accessing (such as to follow my code below), then this is the easiest route, as the pane already filters out the system and other tables that obscure what you need.
To see them in R code, dbListTables(conn) will show you all tables, including the ones in the Connections pane I just described, but also a lot of system and otherwise-internal tables that you don't need. On my SQL Server instance it returns over 600 tables, so ... you may not want to do just that, but you can look for specific tables.
For example, if you know you should have tables Table_A and Table_B, then you could do
alltables <- dbListTables(conn)
grep("table_", alltables, value = TRUE, ignore.case = TRUE)
to see all of the table names containing that string.
If you do not see tables that you know you need to access, then it is likely that your connection code did not specify the database, in which case you need something like:
conn <- dbConnect(odbc(), database = "Project_3690", uid = "...", pwd = "...",
                  server = "...", driver = "...")
(Most of these fields should already be in your connection code; don't use literal ... for your strings.)
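Alternatively (untested here, but plain T-SQL): you can switch an existing connection's default database and then list tables, which should now be scoped to that database (plus its system tables):
DBI::dbExecute(conn, "USE Project_3690")  # switch the connection's default database
dbListTables(conn)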
One can use a system table to find the other tables:
DBI::dbGetQuery(conn, "select * from information_schema.tables where table_type = 'BASE TABLE' and table_schema = 'dbo'")
# TABLE_CATALOG TABLE_SCHEMA TABLE_NAME TABLE_TYPE
# 1 Project_3690 dbo Table_A BASE TABLE
# 2 Project_3690 dbo Table_B BASE TABLE
# 3 Project_3690 dbo Table_C BASE TABLE
(Notional output but representative of what it should look like.)
It's not quite straightforward to retrieve data from SQL Server using RStudio when you have different schemas all connected to the server. It is easy to view the connected databases with their schemas in SQL Server Management Studio, but not in RStudio. The easiest way is to use the dot (.) operator: you can retrieve the tables of a specific database by qualifying the name with "." in dbGetQuery. I tried
dbGetQuery(conn, "select * from project_3690.dbo.AE_ISD")
and it works perfectly fine.
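The same three-part naming should also work for listing just that database's tables (a variation on the information_schema query shown earlier; untested here):
DBI::dbGetQuery(conn, "select table_name from Project_3690.information_schema.tables where table_schema = 'dbo'")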
I have been using dbWriteTable() to save dataframes from R into a SQL Server database for a while.
Normally, I do the following:
library(odbc)
con <- odbc::dbConnect(odbc(),
                       Driver = "SQL Server",
                       Server = server,
                       Database = "research",
                       Trusted_Connection = "True")
table <- Id(schema = "schemaName", table = "tableName")
savingResult <- dbWriteTable(con, table, dataframeToSave,
                             append = TRUE, overwrite = FALSE,
                             batch_rows = nrow(dataframeToSave))
It worked well until recently, when I started using the same code to save data into a view in the database. This view has a trigger attached: when you insert into the view, the trigger does some checks and saves the data into the corresponding tables. Basically, I can treat this view as a table.
When I do so, though, my data team tells me I am not doing a batch insert; I am inserting the data row by row. So if I insert a dataframe with 1000 rows, the trigger fires 1000 times, which makes the saving process very slow.
I thought dbWriteTable() always did a batch insert, and lots of posts I found online also say that dbWriteTable() does a batch insert. My data team asked whether the function has a parameter similar to FIRE_TRIGGERS, which might solve the issue, but dbWriteTable() does not seem to have any such parameter. Can anyone confirm whether dbWriteTable() does a batch insert? If not, is there a way to do one?
For now I am working around the issue by using dbWriteTable() to write the data into a temp table first, then inserting from the temp table. That way the trigger fires only once and the insert is very fast, but I would still like to know whether there is an easier way to do this without a temp table.
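For reference, my workaround looks roughly like this (simplified; staging_tmp and schemaName.viewName are placeholder names):
# single bulk write into a staging table
dbWriteTable(con, "staging_tmp", dataframeToSave, overwrite = TRUE)
# one INSERT statement, so the view's trigger fires once rather than once per row
dbExecute(con, "INSERT INTO schemaName.viewName SELECT * FROM staging_tmp")
# clean up the staging table
dbExecute(con, "DROP TABLE staging_tmp")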
Thank you.
I have a Netezza SQL server I connect to using DBI::dbConnect. The server has multiple databases; call them db1 and db2.
I would like to use dbplyr as much as possible and skip having to write SQL code in RODBC::sqlQuery(), but I am not sure how to do the following:
1) How do I read a table in db1, work on it, and have the server write the result into a table in db2 without going through my desktop?
2) How do I do a left join between a table in db1 and another in db2?
It looks like there might be a way to connect to database = "SYSTEM" instead of database = "db1" or "db2", but I am not sure what the next step would be.
con <- dbConnect(odbc::odbc(),
                 driver = "NetezzaSQL",
                 database = "SYSTEM",
                 uid = Sys.getenv("netezza_username"),
                 pwd = Sys.getenv("netezza_password"),
                 server = "NETEZZA_SERVER",
                 port = 5480)
I work around this problem on SQL Server using in_schema and dbExecute as follows; assuming Netezza is not too different, the same approach should work.
Part 1: shared connection
The first problem is connecting to both tables via the same connection. If we use different connections, then joining the two tables results in data being copied from one connection to the other, which is very slow.
library(dplyr)  # for left_join() and %>% below

con <- dbConnect(...) # as required by your database
table_1 <- dplyr::tbl(con, from = dbplyr::in_schema("db1", "table_name_1"))
table_2 <- dplyr::tbl(con, from = dbplyr::in_schema("db2.schema2", "table_name_2"))
While in_schema is intended for passing schema names, you can also use it to pass the database name (or both, with a dot in between).
The following should now work without issue:
# check connection
head(table_1)
head(table_2)
# test join code
left_join(table_1, table_2, by = "id") %>% show_query()
# check left join
left_join(table_1, table_2, by = "id") %>% head()
Part 2: write to database
A remote table is defined by two things:
The connection
The code of the current query (e.g. the result of show_query)
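For example, both pieces can be read straight off the tables defined in Part 1:
table_1$src$con              # the DBI connection behind the remote table
dbplyr::sql_render(table_1)  # the SQL of the current query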
We can use these with dbExecute to write to the database. My example will be with SQL Server (which uses INTO as the keyword); you'll have to adapt it to your own environment if the SQL syntax is different.
# optional, extract connection from table-to-save
con <- table_to_save$src$con

# SQL query
sql_query <- paste0("SELECT *\n",
                    "INTO db1.new_table\n",  # the database and name you want to save to
                    "FROM (\n",
                    dbplyr::sql_render(table_to_save),
                    "\n) subquery_alias")
# run query
dbExecute(con, as.character(sql_query))
The idea is to create a query, executable by the database, that writes the new table. I have done this by treating the existing query as a subquery of the SELECT ... INTO ... FROM (...) subquery_alias pattern.
Notes:
If the SQL produced by show_query or sql_render would work when you access the database directly, then the above should work (all that changes is that the command arrives via R instead of via the SQL console).
The functions I have written to smooth this process can be found here. They also include appending, deleting, compressing, indexing, and handling views.
Writing a table via dbExecute will error if the table already exists in the database, so I recommend checking for this first (see the sketch after these notes).
I use this workaround in other places, but inserting the database name with in_schema has not worked for creating views. To create (or delete) a view I have to ensure the connection is to the database where I want the view.
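A minimal sketch of that existence check (assuming your driver supports dbExistsTable with a qualified DBI::Id; the names are placeholders):
# stop early if the target table already exists
if (DBI::dbExistsTable(con, DBI::Id(catalog = "db1", table = "new_table"))) {
  stop("db1.new_table already exists; drop it first or choose another name")
}
dbExecute(con, as.character(sql_query))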
There's a bug-like phenomenon in the odbc library that has been a known issue for years with the older, slower RODBC library; however, the work-around solutions for RODBC do not seem to work with odbc.
The problem:
Very often a person may wish to create a SQL table from a 2-dimensional R object. In this case I'm doing so with SQL Server (i.e. T-SQL). The account used to authenticate, e.g. "sysadmin-account", may differ from the owner and creator of the database that will house the tables being created, but it has full read/write permissions for the targeted DB.
The odbc() call to do so goes like this and runs "successfully"
library(odbc)
db01 <- odbc::dbConnect(odbc::odbc(), "UserDB_odbc_name")
odbc::dbWriteTable(db01, "UserDB.dbo.my_table", r_data)
This connects and creates a table, but instead of creating the table in the intended location of UserDB.dbo.my_table, it gets created in UserDB.sysadmin-account.dbo.my_table.
Technically, .dbo is a child of the UserDB database. What this is doing is creating a new child object of UserDB called sysadmin-account, with a child .dbo of its own, and then creating the table in there.
With RODBC and some other libraries/languages, we found that a work-around was to change the reference to the target table location to ".dbo.my_table" or, in some cases, "..dbo.my_table". Also, I think running a query to USE UserDB sometimes used to help with RODBC.
None of these solutions seems to have any effect with odbc().
Updates
Tried the DBI library as a potential substitute to no avail
Found a work-around: send the data to a global temp table, then use a SQL statement to copy from the temp table to the intended location
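Roughly like this (simplified, with ##my_table_tmp as a placeholder name; temp-table handling can vary by driver):
# 1. bulk-write into a global temp table, where there is no schema ambiguity
odbc::dbWriteTable(db01, "##my_table_tmp", r_data)
# 2. copy into the intended location using an explicit three-part name
DBI::dbExecute(db01, "SELECT * INTO UserDB.dbo.my_table FROM ##my_table_tmp")
# 3. drop the temp table
DBI::dbExecute(db01, "DROP TABLE ##my_table_tmp")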
I have an Access database with a couple of tables, and I want to work with just one of them. I am using the RODBC library. Let's say the table I want to work with is called dtsample, and my Access database is called database.accdb.
Here is my code:
library(RODBC)
dataconnect <- odbcConnectAccess2007("database.accdb")
data <- sqlQuery(dataconnect, "SELECT * dtsample columb1, columb2 ...")
but it does not work. How can I specify the table in Access that I want to work with?
Your solution is not really one, because you just got around learning about SELECT.
data <- sqlQuery(dataconnect, "SELECT * from dtsample where Columb1 = 'a' or Columb1 = 'b'")
My suggestion, if you are not fluent in SQL: use the query designer in Access, and when the query works, get the SQL code it generates (View: SQL) and paste it into R.
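For example, a pasted query might end up looking like this (the WHERE clause is only for illustration):
library(RODBC)
dataconnect <- odbcConnectAccess2007("database.accdb")
# SQL taken from Access's query designer (View: SQL)
data <- sqlQuery(dataconnect,
                 "SELECT columb1, columb2 FROM dtsample WHERE columb1 = 'a'")
odbcClose(dataconnect)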