I am looking to analyse data in R (using dplyr) contained in an Access database on my laptop. (My first time trying to set up a database connection in R.)
Looking at the tidyverse site, for dplyr to work on the Access data, it seems that the connection must be via the DBI package (rather than RODBC).
I'm struggling with the syntax of dbConnect.
My code for RODBC was
base1<-odbcDriverConnect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=[filepath]/AdventureWorks DW 2012.accdb")
My (failed) attempt for DBI is
DB <- dbConnect(drv=Microsoft Access Driver (*.mdb, *.accdb)), host=[filepath]/AdventureWorks DW 2012.accdb)
What am I doing wrong?
(I'm working on Windows 10 - everything 64 bit.)
I recently needed to convert my RODBC defined db connections to equivalent DBI connections. Here's the original RODBC function:
connect_to_access_rodbc <- function(db_file_path) {
require(RODBC)
# make sure that the file exists before attempting to connect
if (!file.exists(db_file_path)) {
stop("DB file does not exist at ", db_file_path)
}
# Assemble connection strings
dbq_string <- paste0("DBQ=", db_file_path)
driver_string <- "Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
db_connect_string <- paste0(driver_string, dbq_string)
myconn <- odbcDriverConnect(db_connect_string)
return(myconn)
}
As explained here, the dbplyr package is built from the DBI package. The first argument of the DBI::dbConnect() must be an appropriate back-end driver. See the link for a list of drivers. For Access, the odbc::odbc() driver is suitable. The second argument the dbConnect function is the full connection string as used in the previous odbcDriverConnect call. With that in mind the following function should connect to your access database:
connect_to_access_dbi <- function(db_file_path) {
require(DBI)
# make sure that the file exists before attempting to connect
if (!file.exists(db_file_path)) {
stop("DB file does not exist at ", db_file_path)
}
# Assemble connection strings
dbq_string <- paste0("DBQ=", db_file_path)
driver_string <- "Driver={Microsoft Access Driver (*.mdb, *.accdb)};"
db_connect_string <- paste0(driver_string, dbq_string)
myconn <- dbConnect(odbc::odbc(),
.connection_string = db_connect_string)
return(myconn)
}
The odbc package documentation presents a more nuanced example as well:
https://github.com/r-dbi/odbc#odbc
As explained here, you can still use odbcConnectAccess2007() but need 32-bit MS Access 2007 drivers from here. This worked great for me.
I just used this a couple days ago.
library(RODBC)
# for 32 bit windows
# Connect to Access db
# channel <- odbcConnectAccess("C:/Users/Excel/Desktop/Coding/Microsoft Access/Northwind.mdb")
# Get data
# data <- sqlQuery( channel , paste ("select * from Name_of_table_in_my_database"))
# for 64 bit windows
channel <- odbcDriverConnect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:/Users/Excel/Desktop/Coding/Microsoft Access/Northwind.mdb")
data <- sqlQuery( channel , paste ("select * from CUSTOMERS"))
odbcCloseAll()
Related
I'm trying to load a very large dataset into R that is stored in SQL. I am able to use local R (Microsoft R Client 3.3.3.0) to connect to SQL via the following:
library(RODBC)
db <- odbcDriverConnect('driver={SQL Server};server=servername\\servername;database=dbname;trusted_connection=true')
tb <- paste("select top 100 * from dbname", sep = "")
df <- sqlQuery(db, tb)
And all that works fine. However, when I try to do this in a remote session (MMLS R version 3.4.3), it doesn't work:
library(mrsdeploy)
remoteLogin("http://some.url", session = TRUE)
REMOTE> #insert script from above
which returns the following error message:
Error in sqlQuery(db, tb) : first argument is not an open RODBC channel
The problem is with odbcDriverConnect(), as running that locally returns, as expected, an object of class "RODBC" delineating the RODBC connection details but running that remotely returns a scalar of class "integer" (-1).
Am I doing something wrong or is it not possible to remotely connect to a SQL database while remotely connected to an R server?
I am trying establish a connection to my redshift database after following the example provided by AWS https://blogs.aws.amazon.com/bigdata/post/Tx1G8828SPGX3PK/Connecting-R-with-Amazon-Redshift. However, I get errors when trying to establish the connection using their recommended driver. However, when I use the Postgre driver I can establish a connection to the redshift DB.
AWS says their driver is "optimized for performance and memory management", so I would rather use it. Can someone please review my code below, and let me know if they see something wrong? I suspect that I am not setting the URL up correctly, but not sure what I should be using instead? Thanks in advance for any help.
#' This code attempts to establish a connection to redshift database. It
#' attempts to establish a connection using the suggested redshift but doesn't
#' work.
## Clear up space and set working directory
#Clear Variables
rm(list=ls(all=TRUE))
gc()
## Libriries for analyis
library(RJDBC)
library(RPostgreSQL)
#Create DBI driver for working with redshift driver directly
# download Amazon Redshift JDBC driver
download.file('http://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC41-1.1.9.1009.jar',
'RedshiftJDBC41-1.1.9.1009.jar')
# connect to Amazon Redshift using specific driver
driver_redshift <- JDBC("com.amazon.redshift.jdbc41.Driver",
"RedshiftJDBC41-1.1.9.1009.jar", identifier.quote="`")
## Using postgre connection that works
#postgre driver
driver_postgre <- dbDriver("PostgreSQL")
#establish connection
conn_postgre <- dbConnect(driver_postgre, host="nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com",
port="5439",dbname="dev",
user="xxxx", password="xxxx")
#list the tables available
tables = dbListTables(conn_postgre)
## Use URL option to establish connection like the example on AWS website
# url <- "<JDBCURL>:<PORT>/<DBNAME>?user=<USER>&password=<PW>
# url <- "jdbc:redshift://demo.ckffhmu2rolb.eu-west-1.redshift.amazonaws.com
# :5439/demo?user=XXX&password=XXX" #useses example from AWS instructions
#url using my redshift database
url <- "jdbc:redshift://nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com
:5439/dev?user=xxxx&password=xxxx"
#attempt connect but gives an error
conn_redshift <- dbConnect(driver_redshift, url)
#gives the following error:
# Error in .jcall(drv#jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
# java.sql.SQLException: Error message not found: CONN_GENERAL_ERR. Can't find bundle for base name com.amazon.redshift.core.messages, locale en
## Similier to postgre example that works but doesn't work when using redshift specific driver
#gives an error saying url is missing, but I am not sure which url to use?
conn <- dbConnect(driver_redshift, host="nhdev.c6htwjfdocsl.us-west-2.redshift.amazonaws.com",
port="5439",dbname="dev",
user="xxxx", password="xxxx")
# gives the following error:
#Error in .jcall("java/sql/DriverManager", "Ljava/sql/Connection;", "getConnection", :
# argument "url" is missing, with no default
I've done it this way it works for me:
drv <- JDBC("com.amazon.redshift.jdbc41.Driver","PathTO/RedshiftJDBC41-1.1.2.0002.jar")
conn <- dbConnect(drv,"jdbc:redshift://......redshift.amazonaws.com:5439/dev",User,PWD)
The difference I see in yours is that you don't mention the full path to redshift jar in driver_redshift.
Hope it works.
I can not connect to MSSQL using DBI package.
I am trying the way shown in package itself
m <- dbDriver("RODBC") # error
Error: could not find function "RODBC"
# open the connection using user, passsword, etc., as
# specified in the file \file{\$HOME/.my.cnf}
con <- dbConnect(m, dsn="data.source", uid="user", pwd="password"))
Any help appreciated. Thanks
As an update to this question: RStudio have since created the odbc package (or GitHub version here) that handles ODBC connections to a number of databases through DBI. For SQL Server you use:
con <- DBI::dbConnect(odbc::odbc(),
driver = "SQL Server",
server = <serverURL>,
database = <databasename>,
uid = <username>,
pwd = <passwd>)
You can also set a dsn or supply a connection string.
It looks like there used to be a RODBC driver for DBI, but not any more:
http://cran.r-project.org/src/contrib/Archive/DBI.RODBC/
A bit of tweaking has got this to install in a version 3 R but I don't have any ODBC sources to test it on. But m = dbDriver("RODBC") doesn't error.
> m = dbDriver("RODBC")
> m
<ODBCDriver:(29781)>
>
Suggest you ask on the R-sig-db mailing list to maybe find out what happened to this code and/or the author...
Solved.
I used library RODBC. It has great functionality to connect sql and run sql queries in R.
Loading Library:
library(RODBC)
# dbDriver is connection string with userID, database name, password etc.
dbhandle <- odbcDriverConnect(dbDriver)
Running Sql query
sqlQuery(channel=dbhandle, query)
Thats It.
I'm trying to retrieve data from greenplum cluster into R (win client).
I've tried:
library("RODBC")
conn <- odbcDriverConnect("DSN_name")
Sql <- "select * from DB.st.country"
cen_data <- sqlQuery(conn,Sql)
print(cen_data)
I'm getting error:
0A000 7 ERROR: cross-database references are not implemented
I have seen some answers about dblink but when I tried:
sql <- "select dblink_connect('conn', 'dbname=myDB');"
cen_data <- sqlQuery(conn,Sql)
I'm getting error:
"42883 7 ERROR: function dblink_connect(unknown, unknown) does not exist
Does anyone have any idea what Am I doing wrong?
Instead of ODBC, you can also use the RPostgreSQL package, which uses DBI as the backend.
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host="hostname", user=..., pass=...)
This is not an R issue (syntax is ok).
The problem was in database definitions.
You need to have the database in the "select data source". for that you need to have postgreSQL.
The following code connects to my PostgreSQL database successfully (or appears to, at any rate), but attempt to issue queries were met with "relation does not exist" errors, so I tried dbListTables, which doesn't return any tables at all. The database name passed to dbConnect is correct, and the tables do exist. I think the code I'm using is exactly the same as what I was using recently, which worked successfully. Any ideas?
> library(RPostgreSQL)
Loading required package: DBI
> drv <- dbDriver("PostgreSQL")
> con <- dbConnect(drv, dbname="mydb", user="user", password=password)
> dbListTables(con)
character(0)
I'm new to both R and DBI, so I'm sure I could be missing something extremely simple...any help would be appreciated.
Solved-- I was right; it was something incredibly simple (and very, very stupid) on my part. I was running the script from the wrong server. The server I was running it from has an empty copy of the database I was attempting to connect to, so everything succeeded, and the empty result from dbListTables was correct. Once I switched servers (or simply specified the host on the other server), everything worked.
1.Connet to MySQL
a)if Mysql is installed in your system, if not install it.
b)download the RMySQL IN R
library(RMySQL)
drv = dbDriver("MySQL 5.0.1")
make sure MySQL version is correct.
con = dbConnect(drv,host="localhost",dbname="test",user="root",pass="root")
use local host or use the server i.e ip address
use the required database name, user name and password
album = dbGetQuery(con,statement="select * from table")
run required query
close(con)
2.Another way to connect database
a)first install any database like MySQL,Oracle,SQL Server
b)install the ODBC connector for database
library(Rodbc)
channel <- odbcConnect("test", uid="ripley", pwd="secret")
test is the connection name of odbc conector which user has to set manualy
user can find this in Administrator tool
res <- sqlFetch(ch, "table name")
A table can be retrieved as a data frame
res<-sqlQuery(channel, paste("select query"))
part of the with condition one table can be retrieved as a data frame
sqlSave(channel, dataframe)
to save a dataframe to the database(dont use "res<-" something like this)
like user can use
sqlCopy()
sqlDrop()
sqlTables()
close(channel)
always close the connection