I have been using ArcMap to access GIS data on a spatial data server. I want to figure out how to do the same within R.
I know how to read shapefiles into R. I have successfully used maptools and rgdal to open and map locally stored shapefiles (e.g.
http://www.nceas.ucsb.edu/scicomp/usecases/ReadWriteESRIShapeFiles)
My problem arises when the data is not stored locally but on an Application Server. I believe it's an Oracle database. I've been given information about the 1. server, 2. instance (a number), 3. database, 4. user, and 5. password. Normally I would include an example, but it's doubtful that an external user could access the servers.
For example here's how to read and plot local files in R
library(rgdal)
ogrInfo(".", "nw-rivers")
centroids.rg <- readOGR(".", "nw-centroids")
plot(centroids.rg)
The "." points to the local directory. How would I change this to access data on a server? The actual syntax of code would be helpful.
You can read data from Oracle Spatial DBs using GDAL/OGR:
http://www.gdal.org/ogr/drv_oci.html
if you have the driver in your GDAL/OGR installation. If:
require(rgdal)
ogrDrivers()
shows the Oracle driver then you can use readOGR with all the parameters in the right place.
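For instance, a quick programmatic check (using only the standard rgdal functions already shown above) would be:
library(rgdal)
# TRUE if the Oracle Spatial (OCI) driver was compiled into your GDAL/OGR build
"OCI" %in% ogrDrivers()$name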
At a guess, and by analogy with the PostGIS example, I'd say try:
dsn="OCI:userid/password#database_instance:")
ogrListLayers(dsn)
s = readOGR(dsn, layername)
but I don't have an Oracle server to test it on (if I did I'd ditch it tomorrow for PostGIS, and spend the license saving on a yacht) and you don't sound certain it's an Oracle server anyway. The general principle for connecting to any spatial database is the same: check you have an OGR driver, figure out what the dsn parameter looks like, try it.
Another way is to go via ODBC, or another non-spatial R database connection. However, you'll likely get the spatial data back in WKB or WKT form and have to convert it to SpatialWhatevers (points, lines, polygons?).
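As a hedged sketch of that ODBC route (the DSN name "spatialdb", the table and the WKT column "geom_wkt" are invented placeholders, and the CRS is a guess):
library(RODBC)
library(sf)
# pull the attributes plus the geometry as WKT text
ch <- odbcConnect("spatialdb", uid = "userid", pwd = "password")
tab <- sqlQuery(ch, "SELECT id, geom_wkt FROM some_spatial_table", as.is = TRUE)
odbcClose(ch)
# build an sf object from the WKT column; as(spatial_tab, "Spatial") converts it
# to the sp classes (SpatialPointsDataFrame etc.) if you prefer those
spatial_tab <- st_as_sf(tab, wkt = "geom_wkt", crs = 4326)
plot(st_geometry(spatial_tab))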
PostGIS example is here:
https://gis.stackexchange.com/questions/64950/which-is-the-best-way-of-working-with-postgis-data-in-r
I am trying to build a fully automated and sustainable reporting tool in Power BI. I have built a report in Power BI that, among other things, uses an R script at one point to export data to my local C: drive with the following code:
# 'dataset' holds the input data for this script
.libPaths(.libPaths()[3])   # restrict the library search path to its third entry
require(gdata)              # for trim()

# append the trimmed dataset to a CSV in the local OneDrive folder
write.table(trim(dataset), file="C:\\Users\\Username\\OneDrive\\Folder\\Inventory Log.csv",
            sep=",", row.names=FALSE, append=TRUE, col.names=FALSE)
plot(dataset)
While all my other data is connected to Power BI via OneDrive or online sources, this piece is still tied to my local machine. I have a personal gateway set up, but that still requires my local machine to be physically on during the scheduled refresh in the Power BI service.
I have tried using the Microsoft365R package, but my R knowledge and experience are still limited, so I wasn't able to come up with a solution that would allow file="OneDrive path" within the write.table() function to execute successfully in Power BI Desktop, let alone the Power BI Service.
The goal is to fully automate this so that it does not require my computer to be on over the weekend or on a non-work day.
Is it possible to write a csv to a OneDrive file? If so, what are some ways that have worked successfully?
Any ideas? Thank you for any help you can provide!
Microsoft365R author here. Disclaimer: I'm not familiar with PowerBI, but I assume you can run R scripts inside it and connect to the Internet etc.
There are a few things needed to get this to run unattended.
A function to upload a data frame as CSV to Onedrive, without requiring you to login, is as follows:
upload <- function(dataset, upload_path, drive_id, ...)
{
    # write the data frame to a temporary CSV, deleted again on exit
    outfile <- tempfile()
    on.exit(unlink(outfile))
    write.table(dataset, outfile, ...)

    library(Microsoft365R)
    library(AzureGraph)

    # authenticate as the app registration (no interactive login required)
    gr <- create_graph_login(
        tenant="{yourtenant}",
        app="{client_id}",
        password="{client_secret}",
        auth_type="client_credentials"
    )

    # upload the file to the given drive and path
    gr$get_drive(drive_id)$upload_file(outfile, upload_path)
}
On the AAD side, create an app registration and give it a client secret (password). You then give it the Microsoft Graph application permissions necessary to read and write drives, most likely "Files.ReadWrite.All".
Note down the client ID and client secret for your app registration. These are the values you plug into the R function above.
You can get your drive ID with the following R code.
drv <- get_business_onedrive(tenant="yourtenant")
drv$properties$id
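For example, inside the Power BI R script the call could then look like this (the folder path and the drive ID placeholder are hypothetical):
# write 'dataset' to "Reports/Inventory Log.csv" on the target drive;
# the extra arguments are passed straight through to write.table()
upload(dataset, "Reports/Inventory Log.csv", drive_id = "{your_drive_id}",
       sep = ",", row.names = FALSE, col.names = FALSE)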
You'll probably need the help of your friendly local admin to get all this done, if only because most orgs lock down the ability to register apps.
I'm trying to read SAS data into R, similarly to this quick tutorial: https://josezea.wordpress.com/2019/05/02/connect-sas-server-from-r/
The issue is that the server I'm trying to connect to uses the IOM protocol, which doesn't seem to be supported by the RCurl package. Does anyone have suggestions for reading data from a SAS server with this protocol in R? It can be reading from a file path or a library; either works for my scenario. Thanks!
Below is the code I attempted to run in R:
library(RCurl)
library(haven)
protocol <- "IOM"
server <- "server.com:5555"
userpwd <- "username:password"
sasfileRoute <- "/path_to_data/bonus_schedule.sas7bdat"
## Read Data as data frame
url <- paste0(protocol, "://", server, sasfileRoute)
binary_sasdata <- getBinaryURL(url = url, userpwd=userpwd)
df_data = read_sas(binary_sasdata)
I think you're misunderstanding what the linked page does. It shows how to use R to read in a SAS dataset, but not how to connect to SAS.
SAS datasets are usually stored as .sas7bdat files. You should connect via SFTP, a network share, or similar to access the dataset files; this won't work if the datasets are stored in a LASR server or another in-memory location, of course.
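If a network share does work for you, a minimal sketch (the UNC path below is hypothetical) is simply:
library(haven)
# read the .sas7bdat directly from the mounted/UNC path instead of via IOM
df_data <- read_sas("//sas-server/share/path_to_data/bonus_schedule.sas7bdat")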
If you need to connect to SAS itself (to execute code or to access in-memory data), you can do so if the SAS server is a SAS Viya server. See R SWAT for more information on that; it uses SAS's APIs to do what you need.
Otherwise, you will have to run the SAS executable from inside R (if you have access to that version of SAS), or have a SAS user export your data for you from inside SAS. I am not familiar with a way to connect to SAS 9 from R directly, and the link in the comments seems out of date (CRAN at least doesn't seem to have that package any more).
SASPy does allow Python to do something similar with SAS 9.4, so perhaps that's a better route if you have SAS 9.4.
IOM is a SAS protocol used by its Integration Technologies product. Instead of IOM, find the network path to the file and use it. See resources using IOM (C#, Java, PowerShell, etc.) on the SAS website. Usually, IOM is used for code submission and SAS object management.
Currently, using SAS Enterprise Guide, we have some code that pulls data from a data warehouse with a seemingly straightforward CONNECT TO DB2 (x, y, z) statement in a PROC SQL step, where x = the database name, y = the user ID, and z = the password.
Looking in Tools > Connections > Profiles, I can find the host name and port for the connection. I'm trying to see if there's a way to use this information (and find any other needed information) to connect to the same data warehouse using R. Other posts here on SO have some code using JDBC or RODBC, but I wasn't able to get anywhere with that, as they mention drivers (which I don't know anything about) and the Oracle folder in my C drive didn't seem to have any of that information.
After reaching out to someone involved with the warehouse, their response was "it is a SAS data warehouse and not accessible via direct ODBC connections."
I'm not too sure what other information to ask for or provide for this, but do any of you know if what I'm looking to do is possible? Are SAS Data Warehouses structured some way that would prevent me from accessing it in R? If I can, what else do I need besides the host name, port, database name, user ID, and password?
Thanks in advance, I'm pretty new to all of this
I'm working on a process improvement that will use SQL in R to work with large datasets. Currently the source data is stored in several different MS Access databases. My initial approach was to use RODBC to read all of the source data into R, and then use sqldf() to summarize the data as needed. I'm running out of RAM before I can even begin to use sqldf(), though.
Is there a more efficient way for me to complete this task in R? I've been looking for a way to run a SQL query that joins the separate databases before reading them into R, but so far I haven't found any packages that support this functionality.
Should your data be in a database, dplyr (part of the tidyverse) would be the tool you are looking for.
You can use it to connect to a local / remote database, push your joins / filters / whatever there and collect() the result as a data frame. You will find the process neatly summarized on http://db.rstudio.com/dplyr/
What I am not quite certain of (it is not an R issue but rather an MS Access issue) is the means of accessing data across multiple MS Access databases.
You may need to write custom SQL code for that and pass it to one of the databases via DBI::dbGetQuery(), letting MS Access handle the database link.
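A hedged sketch of that workflow, assuming the Microsoft Access ODBC driver is installed (the file path, table and column names are placeholders):
library(DBI)
library(odbc)
library(dplyr)
con <- dbConnect(odbc::odbc(),
                 .connection_string = paste0(
                   "Driver={Microsoft Access Driver (*.mdb, *.accdb)};",
                   "Dbq=C:/Documents/Name_Of_My_Access_Database.accdb;"))
# dplyr (via dbplyr) translates this to SQL; the summary runs in the database
# and only the aggregated result is collected into R
result <- tbl(con, "Name_of_table_in_my_database") %>%
  group_by(some_column) %>%
  summarise(n = n()) %>%
  collect()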
The link you posted looks promising. If it doesn't yield the intended results, consider linking one Access DB to all the others. Links take almost no memory. Union the links and fetch the data from there.
# Load RODBC package
library(RODBC)

# Connect to Access db
channel <- odbcConnectAccess("C:/Documents/Name_Of_My_Access_Database")

# Get data
data <- sqlQuery(channel, "select * from Name_of_table_in_my_database")
These URLs may help as well.
https://www.r-bloggers.com/getting-access-data-into-r/
How to connect R with Access database in 64-bit Window?
Can I import data directly from a Snowflake database into R? I was able to do this by creating an ODBC connection with my Snowflake credentials; however, my company switched Snowflake to single sign-on and I'm unable to get it to work.
You can certainly connect to Snowflake from R, and I would highly recommend the RJDBC library. The two requirements are to install the RJDBC package and to download Snowflake's JDBC jar file (needless to say, keep the jar file somewhere on your drive where it cannot be deleted or moved accidentally). You can pick a version of the jar file, say 3.6.6.
# load library
library(RJDBC)
# specify driver
jdbcDriver <- JDBC(driverClass="net.snowflake.client.jdbc.SnowflakeDriver",
                   classPath="/home/username/R/snowflake-jdbc-3.6.6.jar") # <-- this is where I saved the jar file
# create a connection
# this is the most critical part.
# you have to make sure you enter your SSO path as well as corp username with domain
con <- dbConnect(jdbcDriver, "jdbc:snowflake://company.us-east-1.snowflakecomputing.com/?authenticator=https://your_domain_name.okta.com/",
                 'username@domain.com', 'password')
# to query data
# at this point, you are good to go. start querying data.
dbGetQuery(con, "select current_timestamp() as now")
We support Okta single sign-on from ODBC. Please follow https://docs.snowflake.net/manuals/user-guide/odbc-parameters.html for the steps to configure your ODBC DSN.
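Once the DSN is configured, a hedged sketch of the R side (the DSN name is a placeholder, and the authenticator mirrors the Okta URL from the JDBC example above) could be:
library(DBI)
library(odbc)
con <- dbConnect(odbc::odbc(),
                 dsn = "snowflake_dsn",
                 uid = "username@domain.com",
                 pwd = "password",
                 authenticator = "https://your_domain_name.okta.com/")
dbGetQuery(con, "select current_timestamp() as now")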