Executing a stored Oracle procedure in R using ROracle

I'm having trouble calling an Oracle procedure from R via ROracle. I've tried many different ways of calling the procedure and I keep getting the same errors.
I've had no problem running SELECT queries, but calling a procedure is proving difficult. I've tried both oracleProc and dbSendQuery, and neither works. The ROracle documentation is very poor on examples of calling procedures.
Let's say the Oracle procedure is called MYPROC in MYSCHEMA. The procedure is very simple and takes NO parameters (it reads a few tables and writes to one table).
When I execute the procedure directly in Oracle Developer, there is no problem.
The following works in Oracle Developer (but not in R):
EXEC MYSCHEMA.MYPROC;
Then I try to call the same procedure from R (via ROracle) and it gives me an error. I've tried many different ways of calling the procedure and I get the same errors:
# This didn't work in R
> require(ROracle)
> LOAD_query <- oracleProc(con1, "BEGIN EXEC MYSCHEMA.MYPROC; END;")
This is the error I get:
Error in .oci.oracleProc(conn, statement, data = data, prefetch =
prefetch, :
# Then I tried the following and it still didn't work
> LOAD_query <- oracleProc(con1, "EXEC MYSCHEMA.MYPROC;")
This is the error I got (a bit different from the one above):
Error in .oci.oracleProc(conn, statement, data = data, prefetch =
prefetch, : ORA-00900: invalid SQL statement
# So then I tried dbSendQuery, which works perfectly fine with any SELECT statement, but it didn't work here
> LOAD_query <- dbSendQuery(con1, "BEGIN EXEC MYSCHEMA.MYPROC; END;")
This is the error I get (same as the first one):
Error in .oci.SendQuery(conn, statement, data = data, prefetch =
prefetch, :
# I even tried the following to exhaust all possibilities. And still no luck. I get the same error as above:
> LOAD_query <- oracleProc(con1, "BEGIN EXEC MYSCHEMA.MYPROC(); END;")
My procedure doesn't have any parameters. As I mentioned, it works just fine when called in Oracle Developer.
I've run out of ideas for how to get such a ridiculously simple call to work in R! I am only interested in getting this to work via ROracle, though.

Did you create (compile) the procedure first? For example:
dbGetQuery(con, "CREATE PROCEDURE MYPROC ... ")
Then try to execute the procedure like this:
oracleProc(con, "BEGIN MYPROC(); END;")
You're right that ROracle::oracleProc documentation is not good. This example helped me:
https://community.oracle.com/thread/4058424
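One likely reason the attempts in the question fail is that EXEC is a client-side shorthand (SQL*Plus, SQL Developer) rather than SQL or PL/SQL, so it has to be dropped once the call is wrapped in BEGIN ... END. Here is a minimal sketch of that idea. The helper names plsql_call and run_procedure are made up for illustration, and con is assumed to be an open ROracle connection:

```r
# Build a PL/SQL anonymous block that calls a parameterless procedure.
# EXEC is deliberately absent: it is a client command, not PL/SQL.
plsql_call <- function(proc_name) {
  # Accept only plain (optionally schema-qualified) identifiers.
  ok <- grepl("^[A-Za-z][A-Za-z0-9_$#]*(\\.[A-Za-z][A-Za-z0-9_$#]*)*$", proc_name)
  if (!ok) stop("not a plain procedure name: ", proc_name)
  sprintf("BEGIN %s; END;", proc_name)
}

# Hypothetical wrapper: `con` is assumed to be an open ROracle connection.
# ROracle is only needed when this function is actually called.
run_procedure <- function(con, proc_name) {
  ROracle::oracleProc(con, plsql_call(proc_name))
}

# The statement that would be sent for the procedure in the question.
print(plsql_call("MYSCHEMA.MYPROC"))
```

With a live connection the call would be run_procedure(con1, "MYSCHEMA.MYPROC"). If the procedure does not commit on its own, a dbCommit(con1) afterwards may be needed for its writes to become visible.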

Related

R stop function on multiple objects

I have a simple script that makes 10 sqlQuery and sqlSave calls.
I am trying to create a SQL table that captures when an error or warning occurs.
For example, before a query runs I can write a message such as
print("Sql QueryA loading!"), but in my case into a SQL table rather than the console.
Then the sqlQuery/sqlSave runs, and I write another message: print("Sql QueryA Saved").
I can use
if (NROW(mysqlquery) != 0) { sqlqA = sqlSave(serverdb, "tablename", and so on) }
else { stop(sqlSave("my error table", and so on)) }
But I would have to do this for every object, and I don't understand how to do it for all of them. For example: if NROW of SqlA > 0, execute sqlSave; if NROW of SqlB = 0, stop any code below and write the error to the error table.
This is probably a duplicate, but looking at other questions I can't understand how to get this to work. Thanks to anyone who might be able to help.
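One way to avoid repeating the if/else for every object is to put the queries in a named list and loop over it. The following is only a sketch under assumptions: run_steps, save_result and log_event are hypothetical names, and the stand-in functions at the bottom replace the real sqlQuery/sqlSave calls so the example runs without a database:

```r
# Run each query in order, log every outcome, and stop at the first failure.
# `queries`     : named list of zero-argument functions returning a data frame
# `save_result` : hypothetical function(name, data), e.g. a sqlSave() wrapper
# `log_event`   : hypothetical function(name, status), e.g. a sqlSave() into
#                 the error table
run_steps <- function(queries, save_result, log_event) {
  for (name in names(queries)) {
    log_event(name, "loading")
    result <- tryCatch(queries[[name]](), error = function(e) e)
    if (inherits(result, "error")) {
      log_event(name, paste("error:", conditionMessage(result)))
      stop("step ", name, " failed; later steps skipped", call. = FALSE)
    }
    if (NROW(result) == 0) {
      log_event(name, "error: query returned no rows")
      stop("step ", name, " returned no rows; later steps skipped", call. = FALSE)
    }
    save_result(name, result)
    log_event(name, "saved")
  }
  invisible(TRUE)
}

# Stand-ins for the real queries, the real save, and the real error table.
events <- character(0)
saved <- character(0)
log_event <- function(name, status) events <<- c(events, paste(name, status))
save_result <- function(name, data) saved <<- c(saved, name)
queries <- list(
  SqlA = function() data.frame(x = 1:3),
  SqlB = function() data.frame(x = integer(0)),  # empty result: stops here
  SqlC = function() data.frame(x = 1)
)

outcome <- try(run_steps(queries, save_result, log_event), silent = TRUE)
print(events)
```

In this run SqlA is saved, SqlB is logged as an error because it has no rows, and SqlC is never attempted.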

Avoiding warning message “There is a result object still in use” when using dbSendQuery to create table on database

Background:
I use dbplyr and dplyr to extract data from a database, then I use the command dbSendQuery() to build my table.
Issue:
After the table is built, if I run another command I get the following warning:
Warning messages:
1. In new_result(connection@ptr, statement): Cancelling previous query
2. In connection_release(conn@ptr) :
 There is a result object still in use.
The connection will be automatically released when it is closed.
Question:
Because I don't have a result to fetch (I am sending a command to build a table), I'm not sure how to avoid this warning. At the moment I disconnect after building a table and the warning goes away. Is there anything I can do to avoid this warning?
Currently everything works, I just have this warning. I'd just like to avoid it as I assume I should be clearing something after I've built my table.
Code sample
# establish connection
con = DBI::dbConnect(<connection stuff here>)
# connect to table and database
transactions = tbl(con, in_schema("DATABASE_NAME", "TABLE_NAME"))
# build query string
query_string = "SELECT * FROM some_table"
# drop current version of table
DBI::dbSendQuery(con,paste('DROP TABLE MY_DB.MY_TABLE'))
# build new version of table
DBI::dbSendQuery(con, paste('CREATE TABLE MY_DB.MY_TABLE AS (', query_string, ') WITH DATA'))
Even though you're not retrieving stuff with a SELECT clause, DBI still allocates a result set after every call to DBI::dbSendQuery().
Try calling DBI::dbClearResult() between the DBI::dbSendQuery() calls.
DBI::dbClearResult() does:
Clear A Result Set
Frees all resources (local and remote) associated with a
result set. In some cases (e.g., very large result sets) this
can be a critical step to avoid exhausting resources
(memory, file descriptors, etc.)
The example from the man page shows how the function is called:
con <- dbConnect(RSQLite::SQLite(), ":memory:")
rs <- dbSendQuery(con, "SELECT 1")
print(dbFetch(rs))
dbClearResult(rs)
dbDisconnect(con)
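Applied to statements that return no rows, such as the DROP TABLE and CREATE TABLE in the question, the same idea might look like the sketch below. It uses an in-memory SQLite database as a stand-in (so the SQL is SQLite's, not the question's WITH DATA dialect) and assumes the DBI and RSQLite packages are installed. Whether the question's backend behaves identically is an assumption:

```r
library(DBI)

# In-memory SQLite stands in for the real database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Option 1: send the statement, then free its result explicitly.
rs <- dbSendStatement(con, "CREATE TABLE my_table (id INTEGER)")
dbClearResult(rs)

# Option 2: dbExecute() sends the statement and clears the result itself,
# so nothing is left open before the next statement runs.
dbExecute(con, "DROP TABLE my_table")
dbExecute(con, "CREATE TABLE my_table AS SELECT 1 AS id")

tables <- dbListTables(con)
print(tables)
dbDisconnect(con)
```

dbExecute() is usually the shorter choice for DDL, since there is no result object to remember to clear.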

How can I unserialize a model object using PL/R in Greenplum/Postgres?

Error unserializing model object in Greenplum via PL/R
I store model objects in a Greenplum database (the open source version). I've successfully been able to serialize my model objects, insert them into a table in Greenplum, and unserialize them when needed, but only using R version 3.5 installed on my local machine. The R code below runs successfully:
Code:
fromtable = 'modelObjDevelopment'
mod.id = '7919'
model_obj <-
dbGetQuery(conn,
sprintf("SELECT val from standard.%s where model_id::int = '%s';",
fromtable, mod.id))
iter_model <- postgresqlUnescapeBytea(model_obj)
lm_obj_back <- unserialize(iter_model)
summary(lm_obj_back)
Recently, I installed PL/R on Greenplum with all the libraries that I generally use. I am attempting to recreate the code I use in local R (shown above) so that it runs on Greenplum. After much research I have been trying to run the following transformed code, which keeps failing with the same error.
Code:
DROP FUNCTION IF EXISTS mdl_load(val bytea);
CREATE FUNCTION mdl_load(val bytea)
RETURNS text AS
$$
require("RPostgreSQL")
iter_model<-postgresqlUnescapeBytea(val)
model<-unserialize(iter_model)
return(length(val))
$$
LANGUAGE 'plr';
select length(val::bytea) as len, mdl_load(val) as t
from modelObjDevelopment
where model_id::int = 7919
At this point I don't care what I return, I just want the unserialize function to work.
Error:
[22000] ERROR: R interpreter expression evaluation error Detail: Error in unserialize(iter_model) : unknown input format Where: In PL/R function mdl_load
I hope someone has had a similar issue and might have a clue for me. It seems that the bytea object changes size after being passed into PL/R. I am new to this method and hope someone can help.
$$
require(RPostgreSQL)
## load the PostgresSQL driver
drv <- dbDriver("PostgreSQL")
## connect to the default db
con <- dbConnect(drv, dbname = 'XXX')
rows<-dbGetQuery(con, "SELECT encode(val::bytea,'escape') from standard.modelObjDevelopment where model_id::int=1234")
iter_model<-postgresqlUnescapeBytea(rows[[model_obj_column]])
model<-unserialize(iter_model)
$$
We solved this problem together. For future people coming to this site: fetching and unserializing the model object inside the R code is the way to go.
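To see why the byte representation matters, here is a small base-R sketch, independent of Greenplum and PL/R. It assumes nothing about how PL/R actually passes bytea values; it only shows that unserialize() restores a model from the exact raw vector that serialize() produced, and rejects the same data once it has been turned into text and back:

```r
# Fit a small model and serialize it to a raw vector, the form that
# would be stored in a bytea column.
fit <- lm(dist ~ speed, data = cars)
bytes <- serialize(fit, connection = NULL)

# Round trip: the untouched raw vector restores the model.
restored <- unserialize(bytes)
same_coefs <- isTRUE(all.equal(coef(restored), coef(fit)))

# Re-encoded bytes: a hex text rendering converted back to raw is a
# different (and longer) byte stream, which unserialize() cannot read.
hex_text <- paste(as.character(bytes), collapse = "")
reencoded <- charToRaw(hex_text)
failure <- tryCatch(unserialize(reencoded),
                    error = function(e) conditionMessage(e))

print(same_coefs)
print(length(reencoded) == 2 * length(bytes))
print(failure)
```

The length check mirrors the observation in the question that the object changes size: if the value arriving in the function is an encoded form of the bytes rather than the bytes themselves, it has to be decoded before unserialize() is called.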

Error: unexpected symbol in RScript - No further information provided about the line or syntax generating error

I have read the many posts related to R Syntax errors, but everyone points to the error message and using it to figure out where the error occurs. My situation is different in that the error is generic. See below:
Error: unexpected symbol in "RScript correlation_presalesfinal3.R"
RStudio executes it fine.
It is an incredibly simple script, and I am wondering if it has to do with how I am constructing my Postgres syntax. Does R require line break symbols between the statements (select, from, group by etc)?
That is the only thing I can think of. I am trying to compare a separate R-generated correlation to one generated by PostgreSQL directly. This particular piece is the call to PostgreSQL to calculate the correlation directly.
I appreciate your help!
Here is the code:
#Written by Laura for Standard Imp
#Install if necessary (definitely on the first run)
install.packages("RColorBrewer")
install.packages("gplots")
install.packages("RSclient")
install.packages("RPostgreSQL")
#libraries in use
library(RColorBrewer)
library(gplots)
library(RSclient)
library(RPostgreSQL)
# Establish connection to PostgreSQL using RPostgreSQL
drv <- dbDriver("PostgreSQL")
# Full version of connection setting
con <- dbConnect(drv, dbname="db",host="ip",port=5432,user="user",password="pwd")
# -----------------------------^--------^-------------------^---- -------^
myLHSRHSFinalTable <- dbGetQuery(con,"select l1.a_lhsdescription as LHS, l2.a_rhsdescription as RHS, l7.a_scenariodescription as Scenario, corr(l3.driver_metric, l4.driver_metric) as Amount from schema_name.table_name l3 join schema_name.table_name l4 on L3.Time_ID = l4.Time_ID join schema_name.opera_00004_dim_lhs l1 on l3.LHS_ID = l1.member_id join schema_name.opera_00004_dim_rhs l2 on l4.RHS_ID = l2.member_id join schema_name.opera_00004_dim_scenario l7 on l3.scenario_id = l7.member_id join schema_name.opera_00004_dim_time l8 on l3.time_id = l8.member_id where l7.a_scenariodescription = 'Actual'
group by l1.a_lhsdescription , l2.a_rhsdescription, l7.a_scenariodescription ")
myLHSRHSFinalTable
write.csv(myLHSRHSFinalTable, file = "data_load_stats_final.csv")
# Close PostgreSQL connection
dbDisconnect(con)
Your description of the problem may lack enough detail for people to answer, but I ran into this same error message because I was executing Rscript from within the R shell. Neither the R documentation nor the built-in help always makes clear where a command is meant to be executed.
If you're working from the 'terminal' you use Rscript to run an R script, whereas if you're working from within the 'R shell' you use source() to run an R script.
As I'm still a newbie, I'm sure this answer is too much of an oversimplification, but it works to solve the basic error I was getting.
My script file called output.R can be executed from the 'terminal' command line prompt ("$") within my Linux system by the command:
$ Rscript output.R
Or alternatively from within R, by first starting the R shell and then executing the command at the R prompt (">"):
$ R
> source("output.R")

RImpala: Query Failed When Larger Data

check1<-rimpala.query("select * from sum2")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
java.sql.SQLException: Method not supported
dim(sum2) is 49501 rows and 18 columns.
check1<-rimpala.query("select *from sum3")
dim(sum3) is 102 rows and 6 columns.
It worked with smaller sample size.
Sorry that I can't provide a reproducible example for this. Has anyone encountered the same problem with larger data sizes? Any idea how to solve this? Thanks.
As noted elsewhere on StackOverflow, RImpala does not implement executeUpdate and so cannot run any query that modifies state. I suspect you hit your error not by running a larger SELECT query but rather because you tried to insert, update, or delete some data.
If you'd like to use Impala from R, I'd recommend using dplyrimpaladb.
RImpala (v0.1.6) build is updated with the support to execute DDL queries using executeUpdate.
The latest build contains the following fixes / additions:
Support for DDL query execution.
fetchSize parameter in query function to state the number of records that can be retrieved in one round trip read from Impala.
Fix for query failing when NULL values are being returned.
Compatibility with CDH 5.x.x
You can run DDL queries using the query function as illustrated below:
rimpala.query(Q="drop table sample_table",isDDL="true")
You can also specify the fetchSize in the query function to aid reading large data efficiently.
rimpala.query(Q="select * from sample_table",fetchSize="10000")
Please find the latest build on CRAN: http://cran.r-project.org/web/packages/RImpala/index.html
Source Code : https://github.com/Mu-Sigma/RImpala
I had the same problem with the RImpala package and recommend using the RJDBC package instead:
library(RJDBC)
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
classPath = list.files("path_to_jars",pattern="jar$",full.names=TRUE),
identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2://localhost:21050/;auth=noSasl")
check1 <- dbGetQuery(conn, "select *from sum3")
I used these jar files and everything works as expected:
https://downloads.cloudera.com/impala-jdbc/impala-jdbc-0.5-2.zip
For more information and a speed comparison look at this blog post:
http://datascience.la/r-and-impala-its-better-to-kiss-than-using-java/

Resources