Import data from Snowflake with single sign-on into R

Can I import data directly from a Snowflake database into R? I was able to do this by creating an ODBC connection with my Snowflake credentials; however, my company switched Snowflake to single sign-on (SSO) and I'm unable to get it to work.

You can certainly connect to Snowflake from R, and I would highly recommend the RJDBC package. There are two requirements: install the RJDBC package and download Snowflake's JDBC jar file. (Needless to say, keep the jar somewhere on your drive where it cannot be deleted or moved accidentally.) Any reasonably recent version of the jar should work; the example below uses 3.6.6.
# load the library
library(RJDBC)
# specify the driver; classPath points to wherever you saved the jar file
jdbcDriver <- JDBC(driverClass = "net.snowflake.client.jdbc.SnowflakeDriver",
                   classPath = "/home/username/R/snowflake-jdbc-3.6.6.jar")
# create a connection
# this is the most critical part: you have to make sure you enter your SSO
# path as the authenticator, and your corporate username with its domain
con <- dbConnect(jdbcDriver,
                 "jdbc:snowflake://company.us-east-1.snowflakecomputing.com/?authenticator=https://your_domain_name.okta.com/",
                 "username@domain.com", "password")
# at this point, you are good to go; start querying data
dbGetQuery(con, "select current_timestamp() as now")
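Since dbGetQuery() returns an ordinary data frame, everything from here on is plain R. A short follow-up sketch (the table name is hypothetical):
df <- dbGetQuery(con, "select * from my_schema.my_table limit 10")  # hypothetical table
str(df)
dbDisconnect(con)  # close the connection when you are done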

We support Okta single sign-on from ODBC as well. Please follow https://docs.snowflake.net/manuals/user-guide/odbc-parameters.html for the steps to configure your ODBC DSN.
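Once the DSN is configured per that guide (with its Authenticator parameter pointing at your Okta URL), the R side is the usual odbc pattern. A minimal sketch, assuming a DSN named "snowflake_sso" (the name is hypothetical):
library(DBI)
library(odbc)
con <- dbConnect(odbc::odbc(), dsn = "snowflake_sso",
                 uid = "username@domain.com", pwd = "password")
dbGetQuery(con, "select current_timestamp() as now")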

Related

When connecting R to Microsoft SQL Server, do you have to use a DSN?

I want to connect R to SQL Server so I can export some R data frames as tables to SQL Server.
From a few online tutorials, I've seen they use the RODBC package, and it seems that you first need to create an ODBC name by going to ODBC Data Sources (64-bit) > System DSN > Add > SQL Server Native Client 11.0 and then entering your specifications.
I have no idea how databases are managed, so forgive my ignorance here. My question is: if there is already a database/server set up in SQL Server, in particular the one I want to export my R data to, do I still need to do this?
For instance, when I open Microsoft SQL Server Management Studio, I see the following:
Server type: Database Engine
Server name: example.server.myorganization.com
Authentication: SQL Server Authentication
Login: organization_user
Password: organization_password
After logging in, I can access a database called "Organization_Division_DBO" > Tables, which is where I want to upload my data from R as a table. Does this mean the whole ODBC shebang is already set up for me, and I can skip the steps mentioned here where an ODBC DSN needs to be set up?
Can I instead use the code shown here:
library(DBI)   # dbConnect() and dbWriteTable() are DBI generics
library(odbc)
con <- dbConnect(odbc(),
                 Driver   = "SQL Server",
                 Server   = "example.server.myorganization.com",
                 Database = "Organization_Division_DBO",
                 UID      = "organization_user",
                 PWD      = "organization_password")
dbWriteTable(conn = con,
             name = "My_R_Table",
             value = x)  # x is any data frame I have in R
I note that on this page they use similar code to the above (what is the port number?) and there is also some mention that "there is also support for DSNs", so I am a little confused. Also, is there any advantage/disadvantage to using the odbc package over the RODBC package to do this?
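For what it's worth, the Port in that linked example is the TCP port the SQL Server instance listens on (1433 by default). A DSN-less sketch that spells it out, reusing the server and database names from above:
library(DBI)
library(odbc)
con <- dbConnect(odbc::odbc(),
                 Driver   = "SQL Server",
                 Server   = "example.server.myorganization.com",
                 Port     = 1433,  # default SQL Server TCP port
                 Database = "Organization_Division_DBO",
                 UID      = "organization_user",
                 PWD      = "organization_password")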

How to use DSN file for connection to SQL Server using R

I want to use a DSN file to connect to a SQL Server database with the R library RODBC. I am able to create a User DSN and connect using the uid and pwd, but I want to exclude those from the script if possible. I have my DSN file in the working directory for R, which I verified using getwd().
The examples I see show this command:
odbcConnect(dsn = "<dsn_file>")
I have tried both the DBI and RODBC packages, using dbConnect and odbcConnect:
dbConnect(odbc::odbc(),
          "user_dsn", uid = "username", pwd = "password", database = "db_name")
I would like to connect to the database without showing the uid and pwd in the script. I want to use a DSN file so I can change the credentials in one place rather than in many scripts.
I have a DSN called "phone" that connects to our phone database. To connect and then query that database, I just do the following:
library(RODBC)
phone <- odbcConnect("phone")
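Queries against that handle then return regular data frames; for example (the table name is hypothetical):
calls <- sqlQuery(phone, "select * from call_log")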
I don't have access to a system set up to test it, but I would check this link on how to connect, Setting up R to connect to SQL Server, and this one on SO for info about the RODBC library.
Using a DSN:
con <- dbConnect(odbc::odbc(), "mydbalias")
Here is information about Securing Credentials.
With the odbc package I use the filedsn parameter (described on the Microsoft page). I think it should work with RODBC too.
db_conn <- dbConnect(odbc::odbc(), filedsn="path/to/my/file/user_dsn.dsn")
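Once connected this way, the handle behaves like any other DBI connection; a quick sanity check might be:
dbGetQuery(db_conn, "select 1 as test")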

Connecting to DynamoDB using R

I would like to connect to DynamoDB with R. My ultimate goal is to create a Shiny App to display data that is stored at DynamoDB and updated frequently. So I need an efficient way to retrieve it using R.
The following references give some intuition, but they do not include a native R implementation and have not been updated in a long time.
r language support for AWS DynamoDB
AWS dynamodb support for "R" programming language
R + httr and EC2 api authentication issues
As mentioned in the answers above, running Python within R through rPython would be an option, since Python has SDKs such as boto3.
Another alternative would be using a JDBC driver through RJDBC, which I tried:
library(RJDBC)
drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver",
  classPath = "MyInstallationDir\\lib\\cdata.jdbc.dynamodb.jar",  # backslashes must be escaped in R strings
  identifier.quote = "'"
)
conn <- dbConnect(
  drv,
  "Access Key=xxx;Secret Key=xxx;Domain=amazonaws.com;Region=OREGON;"
)
(Access Key and Secret Key replaced by xxx) and I got the error:
Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) :
Unable to connect JDBC to Access Key=xxx;Secret
Key=xxx;Domain=amazonaws.com;Region=OREGON;
What would be the best practice in this matter? Is there a working, native solution for R? I would appreciate if anyone could point me in the right direction.
Note: The package aws.dynamodb (https://github.com/cloudyr/aws.dynamodb) looks promising but the documentation lacks examples and I could not find any tutorial for it.
I would like to share some updates so that people with the same issue can benefit from this post:
First, I figured out how to use the JDBC driver with a few tweaks:
library(DBI)
library(RJDBC)
drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver",
  classPath = "/Applications/CData/CData JDBC Driver for DynamoDB 2018/lib/cdata.jdbc.dynamodb.jar",
  identifier.quote = "'"
)
conn <- dbConnect(
  drv,
  url = 'jdbc:dynamodb: Access Key=xxx; SecretKey=xxx; Domain=amazonaws.com; Region=OREGON;'
)
dbListTables(conn)
Second, I realized that reticulate makes it very convenient (even more so than rPython) to run Python code inside R, and I ended up using boto3 through reticulate to get data from DynamoDB into R (a sketch follows the links below). You can refer to the following documentation for additional info:
reticulate
boto3 - DynamoDB
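Here is that reticulate + boto3 sketch. It assumes boto3 is installed in the Python environment reticulate uses and that AWS credentials are configured in the usual way; the region and table name are hypothetical:
library(reticulate)
boto3 <- import("boto3")
dynamodb <- boto3$resource("dynamodb", region_name = "us-west-2")
tbl <- dynamodb$Table("my-table")  # hypothetical table name
resp <- tbl$scan()                 # full table scan; fine for small tables
items <- resp$Items                # a list with one element per item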
Last, I heard that RStudio is planning to build a NoSQL database driver (which would be compatible with DBI, dbplyr, pool, etc.), but it probably won't be available anytime soon.
I hope someone creates an R package for AWS as comprehensive as boto3 as AWS gets more and more popular.

R and odbcDriverConnect() to connect R to Teradata

I am trying to connect R to Teradata and am not sure what the inputs to RODBC::odbcDriverConnect() are. There is a teradataR package, but it only works with R versions 3 and under, which I neither have nor want to switch to. Below is the signature of odbcDriverConnect() with its input parameters; "connection" I believe is the most important. I need an address for a driver that I don't even know whether I have, and this is what I need the most help with. How do I get a Teradata driver to connect to R? IT at my work is not sure how to do this. Also, if anyone knows of another way to connect Teradata to R (some other package?), please let me know.
connection = ""
case
believeNRows = TRUE
colQuote, tabQuote = colQuote
interpretDot = TRUE
DBMSencoding = "",
rows_at_time = 100
readOnlyOptimize = FALSE
Thank you for your help!
I was able to connect R to Teradata using the RODBC package. Here is how to do it if you are working on a PC and have a Teradata driver.
Set up DSN:
Go to: Control Panel -> Administrative Tools -> Data Sources (ODBC) -> User DSN tab -> click Add -> select the Teradata driver (or whatever driver you will be using; it could be SQL Server, for instance) and press Finish.
A box will pop up that needs to be filled in. The following fields need to be filled:
Name: can be any name you would like. I chose TeraDataRConnection, for example.
Name or IP address (DBC name or address): mine, for example, is Databasename.companyname.com. I looked at how Microsoft Access was connected to the database and, in doing that, found the DBC address.
Username: the username you use to connect to the database.
Password: the password you use to connect to the database. (If you don't put your password in here, you will have to type it into R manually every time you connect.)
In R:
Download the RODBC package, then:
library(RODBC)
ch <- odbcConnect("TeraDataRConnection", uid = "USERNAME HERE", pwd = "PASSWORD HERE")
If you want to confirm you are connected, you can run this code to see the tables:
ListOfTables <- sqlTables(ch, tableType = "TABLE")
That's it!
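To address the odbcDriverConnect() part of the question directly: you can also skip the DSN and pass a full connection string. A minimal sketch, assuming the Teradata ODBC driver is installed and registered under the name "Teradata" (the driver name and host are examples):
library(RODBC)
ch <- odbcDriverConnect(
  connection = "Driver={Teradata};DBCName=tdprod.company.com;UID=username;PWD=password;"
)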
I was able to connect to Teradata and created a Shiny app which reads data from it.
First we need to install the RODBC package in R. It requires R (>= 4.0.0); no admin access is needed to upgrade R, even on enterprise laptops.
Follow the steps below to set up the connection successfully.
Create an ODBC Data Source to connect to Teradata. The DSN must be 64-bit or 32-bit to match your R installation.
Use the code snippet below to load the data into a reactive variable:
data <- reactive({
  ch <- odbcConnect(dsn = "DSNName", uid = "username", pwd = "password")
  sqlQuery(ch, "select * from emp")
})
DSNName is the name of the DSN connection you created.
You can then call data() to display and use the value stored in it.
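For instance, in the Shiny server function (the output id is hypothetical):
output$emp_table <- renderTable(data())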
Enjoy!

iSeries (AS400) Output with ODBC connection

I am very new to AS400, and I am stuck. I have read the documentation but cannot find what I need.
I have an odbc connection to an AS400 server. When I run this command I get an Outfile with everything I need:
CALL QSYS.QCMDEXC('DSPUSRPRF USRPRF(*ALL) OUTPUT(*OUTFILE) OUTFILE(CHHFLE/TEST3)', 0000000061.00000)
Instead of the results going to an outfile, I need to receive the results of this command in my script that is connecting through ODBC. If I change 'OUTPUT(*OUTFILE)' to 'OUTPUT(*)' I get no results when I try to fetchall().
Is there any way to get this information through the odbc connection to my script?
EDIT: I am on a Linux server, in a Python script using pyodbc to connect. I can run SQL queries successfully over this connection, but I can't figure out how to get the results of a command to come through as some sort of record set.
I hope I'm interpreting what you're asking correctly. It looks like you're accessing user profile data and dumping it to a file, and that you then want to use the contents of that file in a script or something running on Windows. If that's the case:
In general, when accessing data in a file from the Windows world, whether through ODBC and VBScript or .NET, the AS/400 is treated like a database: all files in libraries are exposed via the built-in DB2 database. It's all automatic, part of the universal DB2 database.
So, after creating this file, you should have a file named TEST3 in library CHHFLE.
You'd create a connection and execute the following SQL statement to read the contents:
Select * From CHHFLE.TEST3
This, of course, assumes that you have the proper permissions to access it. You should be able to test this using the iSeries Navigator tool, which includes the ability to run SQL scripts against the database, before doing it in your own script.
Added after reading comments above
There's info at this question on connecting to the DB2 from Python. I hope it's helpful.
OUTPUT(*) is not stdout, unfortunately, so you won't be able to redirect OUTPUT(*) to an ODBC connection. Dumping to a DB2 table via OUTPUT(*OUTFILE) is a good plan. Once that's done, use a standard cursor/fetch loop as though you were working with any other DB2 table.
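In R, which the rest of this page uses, that final fetch step is just a query over an ODBC connection to the outfile; a minimal sketch with a hypothetical DSN name:
library(RODBC)
ch <- odbcConnect("as400_dsn")  # hypothetical DSN pointing at the iSeries
users <- sqlQuery(ch, "Select * From CHHFLE.TEST3")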
