dbConnect to open SQLite in read-only mode in R

I'm working on converting a simple Python script to R to act as a template for connecting to SQLite files. The data is on an NFS mount, and we ran into a few snags setting up the original Python template (namely, disk I/O errors), but we were able to work around them by connecting in read-only mode and setting the VFS to unix-none, e.g.:
# Python version
import sqlite3

path = "file:///mnt_nfs/examplepath.edu/path/file.db"
connect = sqlite3.connect(path + "?mode=ro&vfs=unix-none", uri=True)
cur = connect.cursor()
While we know this is far from a perfect solution, it's acting as an interim measure while we set up a more robust database (so our users can still connect to their data in the meantime). However, most of our students are more familiar with R than Python, and I'm having difficulty finding out how to recreate the workarounds in R. Is there some way to tell dbConnect to use the read-only and unix-none options (or their equivalents)?
I have the basics below, but it throws the same disk I/O error the Python code did before we added those arguments, and I can't find anything about these options in the DBI documentation.
connect <- dbConnect(RSQLite::SQLite(), path)

I think it is
dbConnect(SQLite(), dbname=path, flags=SQLITE_RO, vfs="unix-none")
documented at
library(RSQLite)
?`dbConnect,SQLiteDriver-method`
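For completeness, a minimal end-to-end sketch of the R equivalent (the table name is hypothetical; SQLITE_RO and the vfs argument are documented on that help page):
library(RSQLite)

# open the database read-only over the NFS mount; the unix-none VFS
# disables POSIX locking entirely, so avoid concurrent writers
connect <- dbConnect(
  SQLite(),
  dbname = "/mnt_nfs/examplepath.edu/path/file.db",
  flags  = SQLITE_RO,
  vfs    = "unix-none"
)

# hypothetical table name, for illustration only
dbGetQuery(connect, "SELECT * FROM example_table LIMIT 5")

dbDisconnect(connect)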

Related

How to Access SAS Server with IOM Protocol in R

I'm trying to read SAS data into R, similarly to this quick tutorial: https://josezea.wordpress.com/2019/05/02/connect-sas-server-from-r/
The issue is that the server I'm trying to connect to uses the IOM protocol, which doesn't seem to be supported by the RCurl package. Does anyone have suggestions for reading data from a SAS server over this protocol in R? It can be reading from a file path or a library; either works for my scenario. Thanks!
Below is the code I attempted to run in R:
library(RCurl)
library(haven)
protocol <- "IOM"
server <- "server.com:5555"
userpwd <- "username:password"
sasfileRoute <- "/path_to_data/bonus_schedule.sas7bdat"
## Read Data as data frame
url <- paste0(protocol, "://", server, sasfileRoute)
binary_sasdata <- getBinaryURL(url = url, userpwd=userpwd)
df_data = read_sas(binary_sasdata)
I think you're misunderstanding what the linked page does. It shows how to use R to read in a SAS dataset - but not to connect to SAS.
SAS datasets are usually stored as .sas7bdat files. You should connect via SFTP, a network share, or similar to access the datasets; this won't work if the datasets are stored in a LASR server or another in-memory location, of course.
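For instance, if the share is mounted locally, haven can read the file directly (the mount point here is hypothetical; the file name is taken from the question):
library(haven)

# read a SAS dataset straight off a mounted network share
df_data <- read_sas("/mnt/sas_share/path_to_data/bonus_schedule.sas7bdat")
head(df_data)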
If you need to connect to SAS itself (to execute code or to access in-memory data), you can do so if the SAS server is a SAS Viya server. See R SWAT for more information on that; it uses SAS's APIs to do what you need.
Otherwise, you will have to run the SAS executable from inside R (if you have access to that version of SAS), or have a SAS user export your data for you from inside SAS. I am not familiar with a way to connect to SAS 9 via R directly, and the link in comments seems out of date (CRAN at least doesn't seem to have that package any more).
SASPy does allow Python to do something similar with SAS 9.4, so perhaps that's a better route if you have SAS 9.4.
IOM is a SAS protocol used by its Integration Technologies product; it is normally used for code submission and SAS object management, and the SAS website has resources for using it from C#, Java, PowerShell, etc. Instead of going through IOM, find the network path to the file and use that.

How Do I Format A Connection String to Analysis Server in RMDX?

Sorry in advance for asking a very basic/newbie question, but I'm trying to use RMDX to query some data from a Microsoft Analysis Server from RStudio. RMDX is the only package I've been able to successfully install. I've also tried adding X4R with install_github but had some difficulty (and in any case X4R also seems to use a URL as the connection string), and I've tried adding olapR from my RClient library to my R 3.5.2 library, but I get an error about it being built for a version of R with different internals.
RMDX takes a URL as a connection string, and I don't know how to format the data connection... correctly, I guess? I've only used SQL with RODBC in R before, and setting up a data source via the ODBC Data Source Administrator doesn't work for the data warehouse.
Obviously I'm missing a lot of basics/theory/fundamentals, so I'm just kind of shooting in the dark, but I've tried "localhost//[server-name]", "https://[server-name]", and copying the connection string used by some of the Microsoft BI dashboards that connect to the same data warehouse I want to query, and none of them work. Does anyone know how to solve this issue, or can anyone suggest an alternative way of executing MDX queries from RStudio? Thanks!
After experimenting along a similar route, I ended up writing a PowerShell script that connects to MS SSAS (the OLAP cube) via its URL; you will usually use the URL string that has 'msmdpump.dll' mentioned somewhere in it (typically at the end) as the connection string. Inside the script (more precisely, a module) I then rely heavily on the AdomdClient object and its properties, along these lines:
# establish SSAS ADOMD client and open connection
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.AdomdClient") | Out-Null

echo 'Connecting to Database/Cube via Powershell module!'
$con = New-Object Microsoft.AnalysisServices.AdomdClient.AdomdConnection($connectionString)
$con.Open()
$command = New-Object Microsoft.AnalysisServices.AdomdClient.AdomdCommand($MDXquery, $con)
$dataAdapter = New-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($command)
$ds = New-Object System.Data.DataSet

# fetch data
echo 'fill data container w cube data'
$dataAdapter.Fill($ds)
$con.Close()
....
After that, I call this PS script via system2(command = "powershell", ...) from within R, passing the connection string, the MDX query, and the other parameters; the script saves the result as a CSV file in a temp folder, and I then load that back into my R session.
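A minimal sketch of the R side (the script path, parameter names, and output location are all hypothetical and depend on how the PS module is written):
# hypothetical script path and parameters, for illustration only
ps_script <- "C:/scripts/get_cube_data.ps1"
out_csv   <- file.path(tempdir(), "cube_result.csv")
con_str   <- "http://server/olap/msmdpump.dll"
mdx       <- "SELECT ... ON COLUMNS, ... ON ROWS FROM [MyCube]"

system2(
  command = "powershell",
  args = c("-ExecutionPolicy", "Bypass", "-File", ps_script,
           "-ConnectionString", shQuote(con_str),
           "-MDXQuery", shQuote(mdx),
           "-OutFile", shQuote(out_csv))
)

result <- read.csv(out_csv)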
Hope that helps.

SQL on large datasets from several Access databases in R

I'm working on a process improvement that will use SQL in R to work with large datasets. Currently the source data is stored in several different MS Access databases. My initial approach was to use RODBC to read all of the source data into R and then use sqldf() to summarize it as needed. I'm running out of RAM before I can even begin to use sqldf(), though.
Is there a more efficient way for me to complete this task in R? I've been looking for a way to run a SQL query that joins the separate databases before reading them into R, but so far I haven't found any packages that support this functionality.
Should your data be in a database, dplyr (part of the tidyverse) would be the tool you are looking for.
You can use it to connect to a local / remote database, push your joins / filters / aggregations there, and collect() the result as a data frame. You will find the process neatly summarized at http://db.rstudio.com/dplyr/
What I am not quite certain of - but it is an MS Access issue rather than an R one - is the means of accessing data across multiple MS Access databases.
You may need to write custom SQL code for that and pass it to one of the databases via DBI::dbGetQuery(), letting MS Access handle the database link.
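A hedged sketch of that idea (the paths, table names, and the Access IN '...' cross-database syntax are assumptions to verify against your setup):
library(DBI)
library(odbc)

# connect to one "hub" Access database (hypothetical path)
con <- dbConnect(odbc::odbc(), .connection_string = paste0(
  "Driver={Microsoft Access Driver (*.mdb, *.accdb)};",
  "Dbq=C:/data/hub.accdb;"))

# Access SQL can reference a table in another .accdb via IN '<path>',
# so the join runs inside Access and only the result lands in R
df <- dbGetQuery(con, "
  SELECT a.id, a.val, b.other
  FROM tbl_a AS a
  INNER JOIN (SELECT * FROM tbl_b IN 'C:/data/other.accdb') AS b
    ON a.id = b.id")

dbDisconnect(con)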
The link you posted looks promising. If it doesn't yield the intended results, consider linking one Access DB to all the others. Links take almost no memory. Union the links and fetch the data from there.
# Load RODBC package
library(RODBC)

# Connect to the Access db
# (use odbcConnectAccess2007() instead for .accdb files)
channel <- odbcConnectAccess("C:/Documents/Name_Of_My_Access_Database")

# Get data
data <- sqlQuery(channel, "select * from Name_of_table_in_my_database")
These URLs may help as well.
https://www.r-bloggers.com/getting-access-data-into-r/
How to connect R with Access database in 64-bit Window?

Connecting to DynamoDB using R

I would like to connect to DynamoDB with R. My ultimate goal is to create a Shiny App to display data that is stored at DynamoDB and updated frequently. So I need an efficient way to retrieve it using R.
The following references give an intuition but they do not include a native implementation in R and have not been updated for a long time.
r language support for AWS DynamoDB
AWS dynamodb support for "R" programming language
R + httr and EC2 api authentication issues
As mentioned in the answers above, running Python within R through rPython would be an option as there are SDKs for Python such as boto3.
Another alternative would be using a JDBC driver through RJDBC, which I tried:
library(RJDBC)

drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver",
  # note: R strings need forward slashes (or doubled backslashes)
  classPath = "MyInstallationDir/lib/cdata.jdbc.dynamodb.jar",
  identifier.quote = "'"
)
conn <- dbConnect(
  drv,
  "Access Key=xxx;Secret Key=xxx;Domain=amazonaws.com;Region=OREGON;"
)
(Access Key and Secret Key replaced by xxx) and I got the error:
Error in .verify.JDBC.result(jc, "Unable to connect JDBC to ", url) :
Unable to connect JDBC to Access Key=xxx;Secret
Key=xxx;Domain=amazonaws.com;Region=OREGON;
What would be the best practice in this matter? Is there a working, native solution for R? I would appreciate if anyone could point me in the right direction.
Note: The package aws.dynamodb (https://github.com/cloudyr/aws.dynamodb) looks promising but the documentation lacks examples and I could not find any tutorial for it.
I would like to share some updates so that people with the same issue can benefit from this post:
First, I figured out how to use the JDBC driver with a few tweaks:
library(DBI)
library(RJDBC)

drv <- JDBC(
  driverClass = "cdata.jdbc.dynamodb.DynamoDBDriver",
  classPath = "/Applications/CData/CData JDBC Driver for DynamoDB 2018/lib/cdata.jdbc.dynamodb.jar",
  identifier.quote = "'"
)
conn <- dbConnect(
  drv,
  url = 'jdbc:dynamodb: Access Key=xxx; SecretKey=xxx; Domain=amazonaws.com; Region=OREGON;'
)
dbListTables(conn)
Second, I realized that reticulate makes it very convenient (even more so than rPython) to run Python code inside R, and I ended up using reticulated boto3 to get data from DynamoDB into R. You can refer to the following documentation for additional info:
reticulate
boto3 - DynamoDB
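A minimal sketch of that approach (the region and table name are hypothetical; it assumes boto3 is installed in the Python that reticulate finds, with AWS credentials configured):
library(reticulate)

boto3 <- import("boto3")

# hypothetical region and table name
dynamodb <- boto3$resource("dynamodb", region_name = "us-west-2")
table    <- dynamodb$Table("my_table")

# scan() comes back as a named list; Items holds the records
# (scan() paginates, so large tables need LastEvaluatedKey handling)
res   <- table$scan()
items <- res$Items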
Last, I heard that RStudio is planning to build a NoSQL database driver (which would be compatible with DBI, dbplyr, pool, etc.), but it probably won't be available anytime soon.
Hope someone creates an R package as comprehensive as boto3 for AWS as it gets more and more popular.

iSeries (AS400) Output with ODBC connection

I am very new to AS400, and I am stuck. I have read documentation but cannot find what I need.
I have an odbc connection to an AS400 server. When I run this command I get an Outfile with everything I need:
CALL QSYS.QCMDEXC('DSPUSRPRF USRPRF(*ALL) OUTPUT(*OUTFILE) OUTFILE(CHHFLE/TEST3)', 0000000061.00000)
Instead of the results going to an outfile, I need to receive the results of this command in my script, which connects through ODBC. If I change 'OUTPUT(*OUTFILE)' to 'OUTPUT(*)' I get no results when I try to fetchall().
Is there any way to get this information through the ODBC connection to my script?
EDIT: I am on a Linux server, in a Python script using pyodbc to connect. I can run SQL queries successfully using this connection, but I can't figure out how to get the results of a command to come through as some sort of record set.
I hope I'm interpreting what you're asking correctly. It looks like you're accessing user profile data and dumping it to a file, and that you then want to use the contents of that file in a script or something that's running on Windows. If that's the case:
In general, when accessing data in a file from the Windows world, whether through ODBC and VBScript or .NET, the AS/400 is treated like a database. All files in libraries are exposed via the built-in DB2 database. It's all automatic, and part of the Universal DB2 database.
So, after creating this file, you should have a file named TEST3 in library CHHFLE.
You'd create a connection and execute the following SQL statement to read the contents:
Select * From CHHFLE.TEST3
This, of course, assumes that you have proper permissions to access this. You should be able to test this using the iSeries Navigator tool, which includes the ability to run SQL Scripts against the database before doing it in your script.
Added after reading comments above
There's info at this question on connecting to the DB2 from Python. I hope it's helpful.
OUTPUT(*) is not stdout, unfortunately. That means you won't be able to redirect OUTPUT(*) to an ODBC connection. Dumping to a DB2 table via OUTPUT(*OUTFILE) is a good plan. Once that's done, use a standard cursor / fetch loop as though you were working with any other DB2 table.
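In keeping with the R theme of this page, a hedged sketch of that two-step pattern via DBI/odbc (the DSN is hypothetical; the CL command and outfile come from the question, and the same steps map directly onto pyodbc's execute()/fetchall()):
library(DBI)
library(odbc)

# hypothetical DSN pointing at the IBM i (AS400) ODBC driver
con <- dbConnect(odbc::odbc(), dsn = "AS400")

# step 1: run the CL command, dumping the user profiles to an outfile
dbExecute(con, "CALL QSYS.QCMDEXC('DSPUSRPRF USRPRF(*ALL) OUTPUT(*OUTFILE) OUTFILE(CHHFLE/TEST3)', 0000000061.00000)")

# step 2: read the outfile back like any other DB2 table
profiles <- dbGetQuery(con, "SELECT * FROM CHHFLE.TEST3")

dbDisconnect(con)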
