Connecting R and Redshift via dplyr - r

I want to use data in R that is stored in Redshift. I have no problems connecting via the redshift package:
library(redshift)
conn_redshift <- redshift.connect("jdbc:postgresql://host:port/dbname",
                                  user,
                                  password)
table_Names <- redshift.tables(conn_redshift)$table_name
This works fine. However, I am a fan of the dplyr interface, and I have problems connecting via src_postgres. I therefore first tried connecting to a local database, which works as well:
library(dplyr)
connection <- src_postgres(dbname, user,password, host = "localhost")
However, when I add the Redshift credentials
myRedshift2 <- src_postgres(dbname, host ,port, user,password)
I get the following error
Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (could not connect ... on dbname)
Since I am able to connect to a local Postgres database, it does not seem to be an installation or driver problem, and since I can connect to Redshift via another package, it does not seem to be a permission (IP filter) problem. The same code works on a colleague's laptop, so it should not be a code problem either. We have compared sessionInfo(), and the output is exactly the same. Does anyone have another idea?

Related

Shiny App Connect dbPool through ODBC or specific driver

I'm struggling to connect my Shiny application to one of the databases we use in our company. I've successfully connected to Azure / Mongo / SQL Server databases, but now I have a SAP SQL Anywhere 17 database to connect to.
Not surprisingly there's no specific connection to that database provided in the R Drivers (https://www.rstudio.com/products/drivers/).
I believe I can solve this in two ways: our IT department is convinced that a generic ODBC connection should work, or I have to get the specific SQL Anywhere drivers installed for my Shiny app somehow.
I can't find much online for either solution. If I search for a generic ODBC connection, the recommendations point to FreeTDS, which is in the RODBC package, which then might not work together with Pool (according to what I've read).
Searching for how to install specific drivers for a Shiny application has not turned up much either.
Try using
options(java.parameters="-Xmx8g")
library(RJDBC)
drv <- JDBC("com.microsoft.sqlserver.jdbc.SQLServerDriver",
            "drivers/jdbc/mssql-jdbc-8.4.1.jre8.jar",
            identifier.quote = "`")
processingStart <- Sys.time()
conn <- dbConnect(drv,
                  "jdbc:sqlserver://SERVER_NAME;databaseName=DATABASE_NAME",
                  user = "USER_NAME",
                  password = "PASSWORD")

Connecting to an Azure SQL Server Data Warehouse from R on a Mac - See random names instead of tables

I'm trying to connect to an Azure SQL Server (12.00.1900) from R on a Mac, using Microsoft's unixodbc SQL Server drivers (17).
I get a connection, but instead of seeing the 12 or so tables that live in the database, dbListTables returns 442 tables, all with nonsensical names, beginning with 'Csoe', 'Ote', and ending in 'xlshm_idad'. Instead of seeing the single schema that lives in the database, I see cin_1mro__e, IFRAINSHM, and s, none of which have any tables in them.
Note that when I use an ordinary SQL visualization app, that doesn't use the MS drivers, I'm able to see the tables and their content properly.
In addition, the RSQLServer package gets a working connection and sees the tables correctly, but isn't compatible with dplyr semantics.
Can anyone help or advise? I've looked for third party SQL Server unixodbc drivers for Mac, and I can't find any.
Until I see more info from OP, I'll leave as my answer the general recommendation to use R's odbc package. Assuming the correct drivers are installed, connection is configured correctly in odbc.ini, and assuming trusted_connection=yes is used in the same, then connecting from R is as simple as:
library(odbc)
dbConn <- dbConnect(odbc(), dsn = "myDSN")
If trusted connection is not on, you just need to pass the uid and pwd arguments.
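As a minimal sketch of the non-trusted case, the wrapper below passes uid and pwd through to dbConnect. The function name connect_with_credentials and the environment variables DB_UID and DB_PWD are made up for illustration; "myDSN" is the same placeholder DSN as above and must already be defined in odbc.ini.

```r
# Hypothetical wrapper (not part of the odbc package): connects through a DSN
# and supplies credentials explicitly instead of relying on trusted_connection.
# DB_UID and DB_PWD are assumed environment variable names.
connect_with_credentials <- function(dsn,
                                     uid = Sys.getenv("DB_UID"),
                                     pwd = Sys.getenv("DB_PWD")) {
  if (!requireNamespace("odbc", quietly = TRUE)) {
    stop("The odbc package is required.")
  }
  # uid and pwd are passed straight through to the ODBC driver
  DBI::dbConnect(odbc::odbc(), dsn = dsn, uid = uid, pwd = pwd)
}

# Usage, assuming a DSN named "myDSN" exists:
# dbConn <- connect_with_credentials("myDSN")
# DBI::dbListTables(dbConn)
```

Reading the credentials from environment variables keeps them out of the script, but hard-coded strings would work the same way.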
Also, OP, it may be that you did not install FreeTDS, so try the following (replace with the equivalent for the package manager you're using):
brew install freetds --with-unixodbc
This gives you the libtdsodbc.so driver. Make sure the DSN points to this.

Trouble connecting to Oracle database using RODBC

I recently upgraded from Windows 7 to Windows 10 and had to reset some remote database connections. I had previously been connecting quite successfully to an Oracle database using the Oracle 11g client and RODBC.
library(RODBC)
channel <- odbcConnect(dsn = "myoracleDB",
                       uid = 'myusername',
                       pw = 'mypassword',
                       believeNRows = FALSE)
result<- sqlQuery(channel,"select * from schema_name.table_name")
close(channel)
Since the Windows 10 upgrade, the above connection protocol no longer works. Specifically, I get the following error:
channel<-
odbcConnect(dsn="myoracleDB",
uid='myusername',
pw='mypassword',
believeNRows=FALSE)
Warning messages:
1: In RODBC::odbcDriverConnect("DSN=myoracleDB;UID=myusername;
PWD=mypassword",:
[RODBC] ERROR: state HY000, code 12170, message [Oracle][ODBC]
[Ora]ORA-12170: TNS:Connect timeout occurred
2: In RODBC::odbcDriverConnect("DSN=myoracleDB;UID=myusername;
PWD=mypassword",:ODBC connection failed
Two additional observations are relevant here:
I use the Windows command line to execute tnsping myoracleDB which returns a successful connection to the database
I can also use Oracle's SQL Developer Application to successfully connect to and query from the database.
So I feel confident that the Oracle Client and the ODBC Data Sources are set up correctly.
Interestingly, I AM able to connect to my database using the RODBC library if I use the following code:
mycon = odbcDriverConnect("Driver={Oracle in OraClient11g_home1};
Dbq=myoracleDB; Uid=myusername; Pwd=mypassword;",
believeNRows=FALSE)
My question for the community is:
This new connection protocol works (which I'm happy about). However, since I don't really understand why it works when the approach that worked before no longer works, I fear I may be ignoring some underlying problem that could really hurt me down the road.
I have found the following SO threads to be helpful, though neither really addresses my issue exactly:
Failure to connect to odbc database in R
Connect to ORACLE via R, using the info in sql developer
UPDATE:
I have accessed the Windows ODBC 64 bit menu and verified that I do have a DSN called "myoracleDB" which is assigned to the "Oracle in OraClient11g_home1" driver. I have tested this connection and find that it works fine. I have also used the RODBC line:
odbcDataSources()
in RStudio and found that the data source "myoracleDB" is recognized. However, when I try to execute:
channel<-
odbcConnect(dsn="myoracleDB",
uid='myusername',
pw='mypassword',
believeNRows=FALSE)
I still get the error:
"TNS: Connect timeout occurred ODBC connection failed"
If you check out the docs, DSN=myoracleDB tells RODBC to connect to the Windows DSN "myoracleDB", while Dbq=myoracleDB tells RODBC to connect to the TNSNAMES entry "myoracleDB". They're two different ways of resolving database names. tnsping and SQL Developer also both use TNSNAMES to resolve databases.
So I think your DSN probably got deleted when you reset things. You can test it by going to Control Panel > Administrative Tools > Data Sources (ODBC). If your database is there, you should be able to Configure it and click Test Connection to make sure it's working. Otherwise you can add it there, and your original configuration should work again.
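To make the difference between the two lookups concrete, here is a small sketch in base R that builds both styles of connection string. The helper build_conn_string is hypothetical and exists only for illustration; the driver name and credentials are the placeholders from the question.

```r
# Hypothetical helper: builds the two styles of connection string discussed above.
build_conn_string <- function(name, uid, pwd, via = c("dsn", "tns"),
                              driver = "Oracle in OraClient11g_home1") {
  via <- match.arg(via)
  if (via == "dsn") {
    # Name is resolved through the Windows ODBC data source list
    paste0("DSN=", name, ";UID=", uid, ";PWD=", pwd)
  } else {
    # Name is resolved through the Oracle client's TNSNAMES entry
    paste0("Driver={", driver, "};Dbq=", name, ";Uid=", uid, ";Pwd=", pwd, ";")
  }
}

dsn_string <- build_conn_string("myoracleDB", "myusername", "mypassword", via = "dsn")
tns_string <- build_conn_string("myoracleDB", "myusername", "mypassword", via = "tns")

# With RODBC installed, either string could be passed to odbcDriverConnect():
# channel <- RODBC::odbcDriverConnect(tns_string, believeNRows = FALSE)
```

The first string matches what odbcConnect(dsn = ...) sends, as shown in the warning message in the question; the second matches the odbcDriverConnect call that works.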

RCassandra is not connecting to Cassandra Database

I'm new to Cassandra and R. When I connect to a Cassandra database using the RCassandra package, the connection is established. But when I try to use any keyspace, R stops responding. I used the following statements.
c <- RC.connect('192.168.1.20', 9042)
RC.use(c, 'effesensors')
Please give me a brief idea about how to use RCassandra to avoid this problem.
Are you aware that you may be using a non default port for Cassandra? If you can provide the Cassandra version and RStudio version I may be able to update my answer. I found this tutorial by tarkalabs useful as a checklist of steps to take before any connection is attempted.
From the tutorial,
Now connect to your database with
connect.handle <- RC.connect(host="127.0.0.1", port=9160)
Cassandra by default listens to port 9160 but you can change it according to your configuration. To show the cluster name, type into your prompt
RC.cluster.name(connect.handle)
Just to verify that you are connected and your Cassandra instance is running try the following command:
RC.describe.keyspaces(connect.handle)
That should bring back a list of the settings in your keyspaces. If nothing returns, you are either not connected or your Cassandra instance is not properly installed.
EXAMPLE OUTPUT
$system_traces$strategy_options
replication_factor
"2"
$system_traces$cf_defs
named list()
$system_traces$durable_writes
[1] TRUE
Let me know what your results are if my answer does not work and I will update my answer. Good Luck!
Use RODBC instead of RCassandra. You need to install the Cassandra ODBC driver.
Thanks @D. Venkata Naresh, your suggestion of using the RODBC driver resolved my issue.
I am using R and datastax cassandra community edition.
This is the link I followed to configure the ODBC driver on my Windows machine.
https://www.datastax.com/dev/blog/using-the-datastax-odbc-driver-for-apache-cassandra
Then, in RStudio, these are the commands to connect to Cassandra and fetch data:
install.packages("RODBC")
library(RODBC)
# Replace the placeholders with your ODBC data source name and table name
conn <- odbcConnect("<ODBC datasource name>")
dataframe <- sqlFetch(conn, "<column family / table name>")
dataframe
I hope this answer helps someone who is facing issues with RCassandra.
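I read your comments above; you are using the wrong port. Port 9042 is Cassandra's native CQL port, whereas RCassandra uses the Thrift interface, which listens on 9160 by default. You should run the following command: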
c <- RC.connect('192.168.1.20', 9160)
This will definitely work for you.

Using dplyr to connect to SSL-encrypted remote database

I would like to use the dplyr package in R but to connect to a remote database that is SSL-encrypted. How do I set up a workaround here? I'm thinking of setting up a backend that uses the RODBC package. Is this possible?
Actually, you can connect to an SSL-encrypted database with dplyr, and it's easy.
You just need to pass the parameters for your connection within the dbname parameter, like this (this is a postgresql example):
db <- src_postgres(dbname="dbname=my_db sslcert=my_cert.crt sslkey=my_key.key sslmode=require", user="username", host="your.host.com")
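If the list of SSL options grows, it may be easier to assemble the dbname string from named values. The sketch below uses only base R; build_conninfo is a hypothetical helper, and the certificate and key file names are the placeholders from the example above.

```r
# Hypothetical helper: joins named options into a space-separated
# key=value string like the one passed as dbname above.
build_conninfo <- function(...) {
  opts <- list(...)
  paste(names(opts), unlist(opts), sep = "=", collapse = " ")
}

conninfo <- build_conninfo(dbname  = "my_db",
                           sslcert = "my_cert.crt",
                           sslkey  = "my_key.key",
                           sslmode = "require")

# With dplyr and RPostgreSQL installed, the string would be used as before:
# db <- src_postgres(dbname = conninfo, user = "username", host = "your.host.com")
```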

Resources