R stop function on multiple objects - r

I have a simple script that has 10 sqlquery and sqlSave functions.
I am trying to create a sql table to capture when an error or warning occurs.
Example I can write
print("Sql QueryA loading!") into a table but in my case sql.
then the sqlquery/sqlsave will run and then i write another print"Sql QueryA Saved"
I can use
if(NROW(mysqlquery)!=0{sqlqA= sqlSave(serverdb,"tablename"and so on)}
else{stop(sqlSave("my error table" and so on)}
But i would have to do this for every object, I dont understand how I can do this for all my objects for example if NROWS SqlA >0 then execute sqlsave if NROWS SqlB =0 the stop any code below and write the error to the error table.
This is probably a duplicate but looking at other questions i cant understand how i can get this to work. thanks to anyone that might be able to help.

Related

RODBC connection issue

I am trying to use RODBC to connect to an access database. I have used the same structure several times in this project with success. However, in this instance it is now failing and I cannot figure out why. The code is not really reprex as I can't provide the DB, but...
This works for a single table:
library(magrittr);library(RODBC)
#xWalk_path is simply the path to the accdb
#xtabs generated by querying the available tables
x=1
tab=xtabs$TABLE_NAME[x]
temp<-RODBC::odbcConnectAccess2007(xWalk_path)%>%
RODBC::sqlFetch(., tab, stringsAsFactors = FALSE)
odbcCloseAll()
#that worked perfectly
However, I really want to use this in a a function so I can read several similar tables into a list. As a function it does not work:
xWalk_ls<- lapply(seq_along(xtabs$TABLE_NAME), function(x, xWalk_path=xWalk_path, tab=xtabs$TABLE_NAME[x]){
#print(tab) #debug code
temp<-RODBC::odbcConnectAccess2007(xWalk_path)%>%
RODBC::sqlFetch(., tab, stringsAsFactors = FALSE)
return(temp)
odbcCloseAll()
})
#error every time
The above code will return the error:
Warning in odbcDriverConnect(con, ...) :
[RODBC] ERROR: Could not SQLDriverConnect
Warning in odbcDriverConnect(con, ...) : ODBC connection failed
Error in RODBC::sqlFetch(., tab, stringsAsFactors = FALSE) :
first argument is not an open RODBC channel
I am baffled. I accessed the db to pull table names and generate the xtabs variable using sql Tables. Also, earlier in my code I used a similar code structure (not identical, but same core: sqlFetch to retrieve a table into a list) nd it worked without a problem. Only difference between then and now is that: Then I was opening and closing different .accdb files, but pulling the same table name from each. Now, I am opening and closing the same .accdb file but pulling different sheet names each time.
Am I somehow opening and closing this too fast and it is getting irritated with me? That seems unlikely, because if I force it to print(tab) as the first line of the function it will only print the first table name. If it was getting annoyed about the speed of opening an closing I would expect it to print 2 table names before throwing the error.
return returns its argument and exits, so the remaining code (odbcCloseAll()) won't be executed and the opened file (AccessDB) remains locked as you supposed.

R tryCatch RODBC function issue

We have a number of MS Access databases on a server which are copies from remote locations which are updated overnight. We collate some of the data from these machines for reporting purposes on a daily basis. Sometimes the overnight update fails, meaning we don’t have access to all of the databases, so I am attempting to write an R script which will test if we can connect (using a list of the database paths), and output an updated version of the list including only those which we can connect to. This will then be used to run a further script which will only update the data related to the available databases.
This is what I have so far (I am new to R but reasonably proficient in SAS and SQL – attempting to use R both as a learning exercise and for potential cost savings);
{
# Create Store data locations listing
A=matrix(c(1000,1,"One","//Server/Comms1/Access.mdb"
,2000,2,"Two","//Server/Comms2/Access.mdb"
,3000,3,"Three","//Server/Comms3/Access.mdb"
)
,nrow=3,ncol=4,byrow=TRUE)
# Add column names
colnames(A)<-c("Ref1","Ref2","Ref3","Location")
#Create summary for testing connections (Ref1 and Location)
B<-A[,c(1,4)]
ConnectionTest<-function(Ref1,Location)
{
out<-tryCatch({ch<-odbcDriverConnect(paste("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=",Location))
sqlQuery(ch,paste("select ",Ref1," as Ref1,COUNT(variable) as Count from table"))}
,error=matrix(c(Ref1,0),nrow=1,ncol=2,byrow=TRUE)
)
return(out)
}
#Run function, using 'B' to provide arguments
C<-apply(B,1,function(x)do.call(ConnectionTest,as.list(x)))
#Convert to matrix and add column names
D<-matrix(unlist(C),ncol=2,byrow=T)
colnames(D)<-c("Ref1","Count")
}
When I run the script I get the following error message;
Error in value[3L] : attempt to apply non-function
I am guessing this is because I am using TryCatch incorrectly inside the UDF?
Does anyone have any advice on what I am doing incorrectly, or even if this is the best way to do what I am attempting?
Thanks
(apologies if this is formatted incorrectly, having to post on my phone due to Stackoverflow posting being blocked)
Edit - I think I fixed the 'Error in value[3L]' issue by adding function(e) {} around the matrix function in the error part of the tryCatch.
The issue now is that the script just fails if it can't reach one of the databases, rather than doing the matrix function. Do I need to add something else to make it ignore the error?
Edit 2 - it seems tryCatch does now work - it processes the
alternate function upon error but also shows warnings about the error, which makes sense.
As mentioned in the edit above, using 'function(e) {}' to wrap the Matrix function in the error section of the tryCatch fixed the 'Error in value[3L]' issue, so the script now works, but displays error messages if it can't access a particular channel. I am guessing the 'warning' section of the tryCatch can be used to adjust these as necessary.

Avoiding warning message “There is a result object still in use” when using dbSendQuery to create table on database

Background:
I use dbplyr and dplyr to extract data from a database, then I use the command dbSendQuery() to build my table.
Issue:
After the table is built, if I run another command I get the following warning:
Warning messages:
1. In new_result(connection#ptr, statement): Cancelling previous query
2. In connection_release(conn#ptr) :
 There is a result object still in use.
The connection will be automatically released when it is closed.
Question:
Because I don’t have a result to fetch (I am sending a command to build a table) I’m not sure how to avoid this warning. At the moment I disconnect after building a table and the error goes away. Is there anything I can do do to avoid this warning?
Currently everything works, I just have this warning. I'd just like to avoid it as I assume I should be clearing something after I've built my table.
Code sample
# establish connection
con = DBI::dbConnect(<connection stuff here>)
# connect to table and database
transactions = tbl(con,in_schema(“DATABASE_NAME”,”TABLE_NAME”))
# build query string
query_string = “SELECT * FROM some_table”
# drop current version of table
DBI::dbSendQuery(con,paste('DROP TABLE MY_DB.MY_TABLE'))
# build new version of table
DBI::dbSendQuery(con,paste('CREATE TABLE PABLE MY_DB.MY_TABLE AS (‘,query_string,’) WITH DATA'))
Even though you're not retrieving stuff with a SELECT clause, DBI still allocates a result set after every call to DBI::dbSendQuery().
Give it a try with DBI::dbClearResult() in between of DBI::dbSendQuery() calls.
DBI::dbClearResult() does:
Clear A Result Set
Frees all resources (local and remote) associated with a
result set. In some cases (e.g., very large result sets) this
can be a critical step to avoid exhausting resources
(memory, file descriptors, etc.)
The example of the man page should give a hint how the function should be called:
con <- dbConnect(RSQLite::SQLite(), ":memory:")
rs <- dbSendQuery(con, "SELECT 1")
print(dbFetch(rs))
dbClearResult(rs)
dbDisconnect(con)

Executing a stored oracle procedure in R using ROracle

I'm having trouble executing/calling an Oracle procedure in R via ROracle. I've tried many different ways of calling the procedure and I keep getting the same errors.
I've had no problem doing SELECT queries but calling a procedure is proving difficult. I've used both oracleProc and dbSendQuery functions, but to no avail. Neither of them work. Roracle documentation is pathetic for examples of calling procedures.
Let's say the Oracle procedure is called MYPROC in MYSCHEMA. The procedure is very simple with NO parameters (it involves reading a few tables and writing to a table)
When I execute the procedure directly in Oracle Developer, there is no problem:
The following works in Oracle Developer (but not in R)
EXEC MYSCHEMA.MYPROC;
Then I try to call the same procedure from R (via ROracle) and gives me error. I've tried many different ways of calling the procedure i get same errors:
# This didn't work in R
> require(ROracle)
> LOAD_query <- oracleProc(con1, "BEGIN EXEC MYSCHEMA.MYPROC; END;")
This is the error I get:
Error in .oci.oracleProc(conn, statement, data = data, prefetch =
prefetch, :
# Then i tried the following and it still didn't work
> LOAD_query <- oracleProc(con1, "EXEC MYSCHEMA.MYPROC;")
This is the error i got (a bit different from the one above):
Error in .oci.oracleProc(conn, statement, data = data, prefetch =
prefetch, : ORA-00900: invalid SQL statement
# so then i tried dbSendQuery which works perfectly fine with any SELECT statements but it didn't work
> LOAD_query <- dbSendQuery(con1, "BEGIN EXEC MYSCHEMA.MYPROC; END;")
This is the error i get (same as the first one):
Error in .oci.SendQuery(conn, statement, data = data, prefetch =
prefetch, :
# I even tried the following to exhaust all possibilities. And still no luck. I get the same error as above:
> LOAD_query <- oracleProc(con1, "BEGIN EXEC MYSCHEMA.MYPROC(); END;")
My procedure doesn't have any parameters. As I mentioned it works just fine when called in Oracle developer.
I've run out of ideas how to get such a ridiculously simple query work in R! I am only interested in getting this work via ROracle though.
Did you create (compile) the procedure first? For example:
dbGetQuery(con, "CREATE PROCEDURE MYPROC ... ")
Then try to execute the procedure like this:
oracleProc(con, "BEGIN MYPROC(); END;")
You're right that ROracle::oracleProc documentation is not good. This example helped me:
https://community.oracle.com/thread/4058424

RImpala: Query Failed When Larger Data

check1<-rimpala.query("select * from sum2")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.sql.SQLException: Method not supported
dim(sum2) is 49501 rows and 18 columns.
check1<-rimpala.query("select *from sum3")
dim(sum3) is 102 rows and 6 columns.
It worked with smaller sample size.
sorry that I cant reproduce example to this. Is anyone encounter the same problem with larger data size? Any idea to solve this? Thanks.
As noted elsewhere on StackOverflow, RImpala does not implement executeUpdate and so cannot run any query that modifies state. I suspect you hit your error not by running a larger SELECT query but rather because you tried to insert, update, or delete some data.
If you'd like to use Impala from R, I'd recommend using dplyrimpaladb.
RImpala (v0.1.6) build is updated with the support to execute DDL queries using executeUpdate.
The latest build contains the following fixes / additions:
Support for DDL query execution.
fetchSize parameter in query function to state the number of records that can be retrieved in one round trip read from Impala.
Fix for query failing when NULL values are being returned.
Compatiblity with CDH 5.x.x
You can run DDL queries using the query function as illustrated below:
rimpala.query(Q="drop table sample_table",isDDL="true")
You can also specify the fetchSize in the query function to aid reading large data efficiently.
rimpala.query(Q="select * from sample_table",fetchSize="10000")
Please find the latest build in Cran : http://cran.r-project.org/web/packages/RImpala/index.html
Source Code : https://github.com/Mu-Sigma/RImpala
I have the same problem with the RImpala package and recommend to use the RJDBC package:
library(RJDBC)
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
classPath = list.files("path_to_jars",pattern="jar$",full.names=T),
identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2://localhost:21050/;auth=noSasl")
check1 <- dbGetQuery(conn, "select *from sum3")
I used these jar files an evenything works as expected:
https://downloads.cloudera.com/impala-jdbc/impala-jdbc-0.5-2.zip
For more information and a speed comparison look at this blog post:
http://datascience.la/r-and-impala-its-better-to-kiss-than-using-java/

Resources