While debugging an R script I came across a strange error: "Error in debug(fun, text, condition) : argument must be a closure".
Setup: Windows 7 (64-bit), Oracle client 12 (both 32- and 64-bit), R (64-bit).
Previously the script could be debugged without errors. I have searched the internet for a clue but found no clear explanation of what the mistake is or how to fix it.
Running the code as a plain script rather than as a function produces no errors.
I would be very grateful for your ideas.
The source script (it connects to an Oracle DB and executes a simple query) is as follows:
download1 <- function() {
  # Install the packages only if they are missing
  if (!require("dplyr")) {
    install.packages("dplyr")
  }
  if (!require("RODBC")) {
    install.packages("RODBC")
  }
  library(RODBC)
  library(dplyr)
  # Establish the connection with the DB or schema
  con <- odbcConnect("DB", uid = "ANALYTICS", pwd = "122334fgcx", rows_at_time = 500, believeNRows = FALSE)
  # Check that the connection is working (optional)
  odbcGetInfo(con)
  # Query the database and put the results into the data frame "x"
  ptm <- proc.time()
  x <- sqlQuery(con, "select * from my_table")
  proc.time() - ptm
  # Extract all field names into a separate vector
  # field_names <- sqlQuery(con, "SELECT column_name FROM all_tab_cols WHERE table_name = 'MY_TABLE'")
  close(con)
}
debug(download1(),text = "", condition = NULL)
Pass the function itself to debug(), not the result of calling it, and then call the function:
debug(download1)
download1()
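As a minimal sketch of the difference (add_one is a hypothetical stand-in for download1, so no database is needed): debug() expects a function object, whereas debug(add_one()) first calls the function and then tries to flag its return value, which fails. The exact wording of the error may differ between R versions.

```r
# Hypothetical stand-in for download1(); any function works here.
add_one <- function(x = 1) {
  x + 1
}

# Wrong: add_one() is evaluated first, so debug() receives a number.
wrong <- tryCatch(debug(add_one()), error = function(e) e)
print(inherits(wrong, "error"))

# Right: pass the function itself, without parentheses.
debug(add_one)
flagged <- isdebugged(add_one)
print(flagged)

# Remove the debug flag again so later calls run normally.
undebug(add_one)
```

Once a function is flagged, the next call to it steps through the body in the browser; undebug() switches that off.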
I have searched high and low for answers so apologies if it has already been answered!
Using R I am trying to perform a lazy evaluation of Oracle 11.1 databases. I have used JDBC to facilitate the connection and I can confirm it works fine. I am also able to query tables using dbGetQuery, although the results are so large that I quickly run out of memory.
I have tried dbplyr/dplyr with tbl(con, "ORACLE_TABLE"), but I get the following error:
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for SELECT *
FROM "ORACLE_TABLE" AS "zzz39"
WHERE (0 = 1) (ORA-00933: SQL command not properly ended)
I have also tried using db_table <- tbl(con, in_schema('ORACLE_TABLE'))
This is happening with all databases I am connected to, despite being able to perform a regular dbGetQuery.
Full Code:
# Libraries
library(odbc)
library(DBI)
library(config)
library(RJDBC)
library(dplyr)
library(tidyr)
library(magrittr)
library(stringr)
library(xlsx)
library(RSQLite)
library(dbplyr)
# Oracle Connection
db <- config::get('db')
drv1 <- JDBC(driverClass=db$driverClass, classPath=db$classPath)
con_db <- dbConnect(drv1, db$connStr, db$orauser, db$orapw, trusted_connection = TRUE)
# Query (This one works but the data set is too large)
db_data <- dbSendQuery(con_db, "SELECT end_dte, reference, id_number FROM ORACLE_TABLE where end_dte > '01JAN2019'")
# Query (this one won't work)
oracle_table <- tbl(con_db, "ORACLE_TABLE")
Solved:
Updated Rstudio + Packages.
Follow this manual:
https://www.linkedin.com/pulse/connect-oracle-database-r-rjdbc-tianwei-zhang/
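Insert the following code after the connection ('con') is created, so that dbplyr uses its Oracle SQL translation for the JDBC connection: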
sql_translate_env.JDBCConnection <- dbplyr:::sql_translate_env.Oracle
sql_select.JDBCConnection <- dbplyr:::sql_select.Oracle
sql_subquery.JDBCConnection <- dbplyr:::sql_subquery.Oracle
Background:
I use dbplyr and dplyr to extract data from a database, then I use the command dbSendQuery() to build my table.
Issue:
After the table is built, if I run another command I get the following warning:
Warning messages:
1. In new_result(connection@ptr, statement) : Cancelling previous query
2. In connection_release(conn@ptr) :
There is a result object still in use.
The connection will be automatically released when it is closed.
Question:
Because I don't have a result to fetch (I am sending a command to build a table), I'm not sure how to avoid this warning. At the moment I disconnect after building a table and the warning goes away. Is there anything I can do to avoid this warning?
Currently everything works, I just have this warning. I'd just like to avoid it as I assume I should be clearing something after I've built my table.
Code sample
# establish connection
con = DBI::dbConnect(<connection stuff here>)
# connect to table and database
transactions = tbl(con, in_schema("DATABASE_NAME", "TABLE_NAME"))
# build query string
query_string = "SELECT * FROM some_table"
# drop current version of table
DBI::dbSendQuery(con, paste('DROP TABLE MY_DB.MY_TABLE'))
# build new version of table
DBI::dbSendQuery(con, paste('CREATE TABLE MY_DB.MY_TABLE AS (', query_string, ') WITH DATA'))
Even though you are not retrieving rows with a SELECT statement, DBI still allocates a result set for every call to DBI::dbSendQuery().
Try calling DBI::dbClearResult() on the returned result between the DBI::dbSendQuery() calls.
The documentation describes DBI::dbClearResult() as follows:
Clear A Result Set
Frees all resources (local and remote) associated with a
result set. In some cases (e.g., very large result sets) this
can be a critical step to avoid exhausting resources
(memory, file descriptors, etc.)
The example from the man page shows how the function should be called:
con <- dbConnect(RSQLite::SQLite(), ":memory:")
rs <- dbSendQuery(con, "SELECT 1")
print(dbFetch(rs))
dbClearResult(rs)
dbDisconnect(con)
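For statements that return no rows, such as DROP TABLE or CREATE TABLE, a sketch of two options follows. It assumes an in-memory RSQLite database as a stand-in for the real connection, and the table names are made up for illustration. Either keep the result object and clear it explicitly, or use DBI::dbExecute(), which sends the statement and clears the result in a single call.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the real database connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "some_table", data.frame(id = 1:3))

# Option 1: keep the result object and clear it explicitly.
rs <- dbSendStatement(con, "CREATE TABLE my_table AS SELECT * FROM some_table")
dbClearResult(rs)

# Option 2: dbExecute() runs the statement and clears the result itself.
dbExecute(con, "DROP TABLE my_table")
dbExecute(con, "CREATE TABLE my_table AS SELECT * FROM some_table")

# Check that the rebuilt table has the expected rows.
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM my_table")$n
dbDisconnect(con)
```

With either option no result object is left open, so the next statement on the same connection should not trigger the "Cancelling previous query" warning.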
Error unserializing model object in Greenplum via PL/R
I store model objects in a Greenplum database (the open-source version). I have been able to serialize my model objects, insert them into a table in Greenplum and unserialize them when needed, but only using R version 3.5 installed on my local machine. The R code below runs successfully:
Code:
fromtable = 'modelObjDevelopment'
mod.id = '7919'
model_obj <-
  dbGetQuery(conn,
             sprintf("SELECT val from standard.%s where model_id::int = '%s';",
                     fromtable, mod.id))
iter_model <- postgresqlUnescapeBytea(model_obj)
lm_obj_back <- unserialize(iter_model)
summary(lm_obj_back)
Recently, I installed PL/R on Greenplum with all the necessary libraries that I generally use. I am attempting to recreate the code I use in local R (shown above) so that it runs on Greenplum. After much research I have been trying to run the following transformed code, which keeps failing with the same error.
Code:
DROP FUNCTION IF EXISTS mdl_load(val bytea);
CREATE FUNCTION mdl_load(val bytea)
RETURNS text AS
$$
require("RPostgreSQL")
iter_model<-postgresqlUnescapeBytea(val)
model<-unserialize(iter_model)
return(length(val))
$$
LANGUAGE 'plr';
select length(val::bytea) as len, mdl_load(val) as t
from modelObjDevelopment
where model_id::int = 7919
At this point I don't care what I return, I just want the unserialize function to work.
Error:
[22000] ERROR: R interpreter expression evaluation error Detail: Error in unserialize(iter_model) : unknown input format Where: In PL/R function mdl_load
I hope someone has had a similar issue and might have a clue for me. It seems that the bytea object changes size after being passed into PL/R. I am new to this method and hope someone can help.
$$
require(RPostgreSQL)
## load the PostgresSQL driver
drv <- dbDriver("PostgreSQL")
## connect to the default db
con <- dbConnect(drv, dbname = 'XXX')
rows<-dbGetQuery(con, "SELECT encode(val::bytea,'escape') from standard.modelObjDevelopment where model_id::int=1234")
iter_model<-postgresqlUnescapeBytea(rows[[model_obj_column]])
model<-unserialize(iter_model)
$$
We solved this problem together. For future visitors to this site: fetching and unserializing the model object inside the R code is the way to go.
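The "unknown input format" error suggests that unserialize() did not receive the exact bytes that serialize() produced. A small base-R sketch of that behaviour follows. It uses a toy lm fit on the built-in cars data rather than the stored model, and a deliberately overwritten first byte as an assumed stand-in for whatever happens to the bytea value on its way into PL/R.

```r
# Toy model; the real one would come from the database.
fit <- lm(dist ~ speed, data = cars)

# serialize() with connection = NULL returns a raw vector.
bytes <- serialize(fit, connection = NULL)

# Unchanged bytes round-trip cleanly.
fit_back <- unserialize(bytes)
same_coef <- identical(coef(fit), coef(fit_back))
print(same_coef)

# Overwrite the first byte to mimic a corrupted or re-encoded payload.
broken <- bytes
broken[1] <- as.raw(0)
result <- tryCatch(unserialize(broken), error = function(e) e)
print(inherits(result, "error"))
```

Comparing length() of the raw vector before it is stored with the length seen inside the PL/R function is a quick way to check whether the payload was altered in transit.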
I have a really large table (8M rows) that I need to import into R for some processing. The problem is that when I try to bring it into R using the DBI package, I get an error.
My code is below:
options(java.parameters = "-Xmx8048m")
psql.jdbc.driver <- "../postgresql-42.2.1.jar"
jdbc.url <- "jdbc:postgresql://server_url:port"
pgsql <- JDBC("org.postgresql.Driver", psql.jdbc.driver, "`")
con <- dbConnect(pgsql, jdbc.url, user="", password= '')
tbl <- dbGetQuery(con, "SELECT * FROM my_table;")
And the error I get is
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for SELECT * FROM my_table; (Ran out of memory retrieving query results.)
I understand that this is because the result set is too big, but I am not sure how to retrieve it in batches instead of all at once. I have tried dbSendQuery, dbReadTable and dbGetQuery; all of them give the same error.
Any help would be appreciated!
I got it to work by using the RPostgreSQL package instead of the RJDBC package.
With it I was able to send the query with dbSendQuery and then call fetch repeatedly to get the data in chunks of 10,000 rows.
main_tbl <- dbFetch(postgres_query, n=-1) # didn't work, so tried in chunks
df <- data.frame()
while (!dbHasCompleted(postgres_query)) {
  chunk <- dbFetch(postgres_query, 10000)
  print(nrow(chunk))
  df = rbind(df, chunk)
}
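A variation on the loop above, as a sketch only: it collects the chunks in a list and binds them once at the end, which avoids copying a growing data frame on every iteration. It assumes an in-memory RSQLite database as a stand-in for the PostgreSQL connection, and the table contents and the chunk size of 10 are made up for the demo.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the PostgreSQL connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "my_table", data.frame(id = 1:25, value = (1:25) * 2))

res <- dbSendQuery(con, "SELECT * FROM my_table")
chunks <- list()
while (!dbHasCompleted(res)) {
  # Chunk size of 10 is only for the demo; the loop above used 10000.
  chunks[[length(chunks) + 1]] <- dbFetch(res, n = 10)
}
dbClearResult(res)
dbDisconnect(con)

# Bind once at the end instead of growing a data frame inside the loop.
df <- do.call(rbind, chunks)
print(nrow(df))
```

If the full table does not fit in memory at all, each chunk can instead be processed or aggregated inside the loop and then discarded.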
After connecting to MS SQL Server with
cn <- odbcConnect(...)
I can successfully get data using:
tmp <- sqlQuery(cn, "select * from MyTable")
But if I use
tmp <- sqlFetch(cn, "MyTable")
R complains: "Error in odbcTableExists(channel, sqtable) : table not found on channel". Did I miss anything here?
I assume you are working on Windows. When you define your DSN in Control Panel > System and Security > Administrative Tools > Data Sources (ODBC), you have to select a default database as well. If you do that, your code should work as expected.
So the problem is not in your R code but in your DSN, which, in my opinion, lacks the reference to a database that sqlFetch needs.