Errors while passing character parameters to an R UDF in Vertica

I've asked Vertica support about this as well, but I'm wondering whether anyone here has run into the same issue.
I'm working with Vertica Analytic Database v6.1.3-0
I'm using R version 3.0.0 - which comes with the Vertica R language pack.
I'm trying to create a simple UDF that uses a parameter passed with the USING PARAMETERS keyword.
This is the R code :
testFun <- function(x,y) {
# extract the function parameters from y
parameter <- y[["parameter"]] # parameter to be passed
sum(x[,1])
}
testFunParams <- function()
{
params <- data.frame(datatype=rep(NA, 1), length=rep(NA,1),scale=rep(NA,1),name=rep(NA,1))
params[1,1] <- "varchar"
params[1,2] <- "40"
params[1,4] <- "parameter"
params
}
testFunFactory <- function()
{
list(
name=testFun
,udxtype=c("transform")
,intype=c("int")
,outtype=c("varchar(200)")
,outnames=c('test')
,parametertypecallback=testFunParams
,volatility=c("stable")
,strict=c("called_on_null_input")
)
}
In Vertica I run the library :
drop library r_test cascade;
create or replace library r_test as '.../testFun.r' language 'R';
create transform function testFun as name 'testFunFactory' library r_test;
create table test as select 1 as x union select 2 union select 3 union select 4 union select 5 union select 6 union select 7;
select testFun(x) over() from test;
> ERROR 3399: Failure in UDx RPC call InvokeGetParameterType(): Error calling getParameterType() in User Defined Object [testFun] at [/scratch_a/release/vbuild/vertica/UDxFence/vertica-udx-R.cpp:245], error code: 0, message: Error happened in getParameterType : not compatible with REALSXP
I've tried Vertica's example for a function with parameters and it worked, but when I changed the parameter type to varchar it failed.
What can be done?
Thanks

I tested your configuration and it returned the following error
[...] Error happened in getParameterType : not compatible with REALSXP [...]
And after some tweaking I know what happened. You saved "scale" as a character value instead of a numeric in your "testFunParams".
Test if that helps you =)
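A minimal sketch of the corrected callback, assuming the character value the answer refers to is the quoted "40" that the question's code stores in the second (length) column. The only change from the question is that 40 is assigned as a number:

```r
# Parameter-type callback from the question, with the length stored as a number.
# Assumption: the quoted "40" is what triggers "not compatible with REALSXP".
testFunParams <- function() {
  params <- data.frame(datatype = rep(NA, 1), length = rep(NA, 1),
                       scale = rep(NA, 1), name = rep(NA, 1))
  params[1, 1] <- "varchar"    # parameter data type
  params[1, 2] <- 40           # numeric, not "40"
  params[1, 4] <- "parameter"  # parameter name used in USING PARAMETERS
  params
}
```

Running is.numeric(testFunParams()$length) in a plain R session is a quick way to check the column type before reloading the library in Vertica.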

Related

R, ClickHouse: Expected: FixedString(34). Got: UInt64: While processing

I am trying to query a subset of data from a ClickHouse database from R.
Here is the example
library(data.table)
library(RClickhouse)
library(DBI)
subset <- paste(traffic[,unique(IDs)][1:30], collapse = ',')
conClickHouse <- DBI::dbConnect('here is the connection')
DataX <- dbGetQuery(conClickHouse, paste0("select * from database
and IDs in (", subset ,") ", sep = "") )
As a result I get this error:
DB::Exception: Type mismatch in IN or VALUES section. Expected: FixedString(34).
Got: UInt64: While processing (IDs IN ....
Any help is appreciated
Thanks to the comment of @DennyCrane, this query subsets properly:
"select * from database where toFixedString(IDs,34) in
(toFixedString(ID1, 34), toFixedString(ID2,34 ))"
https://clickhouse.tech/docs/en/sql-reference/functions/#strong-typing
Strong Typing
In contrast to standard SQL, ClickHouse has strong typing. In other words, it doesn’t make implicit conversions between types. Each function works for a specific set of types. This means that sometimes you need to use type conversion functions.
https://clickhouse.tech/docs/en/sql-reference/functions/type-conversion-functions/#tofixedstrings-n
select * from (select 'x' B ) where B in (select toFixedString('x',1))
DB::Exception: Types of column 1 in section IN don't match: String on the left, FixedString(1) on the right.
Use a cast with toString or toFixedString:
select * from (select 'x' B ) where toFixedString(B,1) in (select toFixedString('x',1))
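The cast can also be generated on the R side when the query is assembled. The sketch below assumes the IDs are strings that contain no single quotes; the `ids` values and the `fixed_string_list` helper are hypothetical and only illustrate the string building, not the connection.

```r
# Hypothetical IDs; in the question they come from traffic[, unique(IDs)].
ids <- c("ID1", "ID2")

# Wrap every ID in a quoted toFixedString() cast so both sides of IN share a type.
# Assumption: IDs contain no single quotes (no escaping is done here).
fixed_string_list <- function(ids, n = 34) {
  paste0("toFixedString('", ids, "', ", n, ")", collapse = ", ")
}

query <- paste0(
  "select * from database where toFixedString(IDs, 34) in (",
  fixed_string_list(ids), ")"
)
```

The resulting `query` string can then be passed to dbGetQuery() in place of the hand-written one.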

Inserting a R dataframe in SQL table using a stored proc

I have a dataframe in R containing 10 rows and 7 columns. There's a stored procedure that does a few logic checks in the background and then inserts the data into the table 'commodity_price'.
library(RMySQL)
#Connection Settings
mydb = dbConnect(MySQL(),
user='uid',
password='pwd',
dbname='database_name',
host='localhost')
#Listing the tables
dbListTables(mydb)
f= data.frame(
location= rep('Bhubaneshwar', 4),
sourceid= c(8,8,9,2),
product= c("Ingot", "Ingot", "Sow Ingot", "Alloy Ingot"),
Specification = c('ie10','ic20','se07','se08'),
Price=c(14668,14200,14280,20980),
currency=rep('INR',4),
uom=rep('INR/MT',4)
)
For multiple rows insert, there's a pre-created stored proc 'PROC_COMMODITY_PRICE_INSERT', which I need to call.
for (i in 1:nrow(f))
{
dbGetQuery(mydb,"CALL PROC_COMMODITY_PRICE_INSERT(
paste(f$location[i],',',
f$sourceid[i],',',f$product[i],',',f$Specification[i],',',
f$Price[i],',',f$currency[i],',', f$uom[i],',',@xyz,')',sep='')
);")
}
I repeatedly get this error:
Error in .local(conn, statement, ...) :
could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[i],',',
f$sourceid[i],',',f$product[i],',',f$Specification' at line 2
I tried using RODBC, but it does not connect at all. How can I insert the data from the R dataframe into the 'commodity_price' table by calling a stored proc? Thanks in advance!
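That is probably due to your use of quotes: the paste() call sits inside the string literal, so it is sent to MySQL as text instead of being evaluated in R. This might work: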
for (i in 1:nrow(f))
{
dbGetQuery(mydb,paste("CALL PROC_COMMODITY_PRICE_INSERT(",f$location[i],',',
f$sourceid[i],',',f$product[i],',',f$Specification[i],',',
f$Price[i],',',f$currency[i],',', f$uom[i],',',"@xyz",sep='',");"))
}
or the one-liner:
dbGetQuery(mydb,paste0("CALL PROC_COMMODITY_PRICE_INSERT('",apply(f, 1, paste0, collapse = "', '"),"');"))
Trying the for loop:
for (i in 1:nrow(f))
{
dbGetQuery(mydb,paste("CALL PROC_COMMODITY_PRICE_INSERT(","'",f$location[i],"'",',',"'",
f$sourceid[i],"'",',',"'",f$product[i],"'",',',"'",f$Specification[i],"'",',',"'",
f$Price[i],"'",',',"'",f$currency[i],"'",',',"'",f$uom[i],"'",',','@xyz',sep='',");"))
}
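The quoting in the loop above is easier to get right when it lives in a small helper. The following is only a sketch: `sql_quote` and `build_call` are hypothetical helpers, numeric columns are passed unquoted, and the trailing argument is assumed to be a MySQL user variable written `@xyz` that receives the procedure's OUT value.

```r
# Quote a value as a SQL string literal, doubling embedded single quotes.
sql_quote <- function(x) {
  paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
}

# Build one CALL statement from a single-row data frame.
# Assumption: the last procedure argument is the user variable @xyz.
build_call <- function(row) {
  args <- vapply(
    row,
    function(v) if (is.numeric(v)) as.character(v) else sql_quote(v),
    character(1)
  )
  paste0("CALL PROC_COMMODITY_PRICE_INSERT(",
         paste(args, collapse = ", "), ", @xyz);")
}

# One row shaped like the question's data frame.
f <- data.frame(location = "Bhubaneshwar", sourceid = 8, product = "Ingot",
                Specification = "ie10", Price = 14668, currency = "INR",
                uom = "INR/MT", stringsAsFactors = FALSE)
stmt <- build_call(f[1, ])
```

Inside the loop, each statement would then be sent on its own, for example with dbGetQuery(mydb, build_call(f[i, ])).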

RPostgreSQL and DBI: "operator does not exist: uuid = text"

When using dbReadTable to read in database tables that use UUID as the primary key, I get the following warning message.
1: In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type uuid (id:2950) in column 0)
When I modify the table I loaded and try to update the database using postgis_update, I get the following error message:
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: operator does not exist: uuid = text
I get that the UUID type is not available in R, but is there a way that we can make the database believe the character vector "unique_id" is UUID instead of text?
Code:
library(RPostgreSQL)
library(postGIStools)
pgdrv <- dbDriver(drvName = "PostgreSQL")
# === open connection
db <- DBI::dbConnect(pgdrv,
dbname="database",
host="localhost", port=5432,
user = 'postgres')
# === get tables
users <- dbReadTable(db, "app_users")
# === interaction with tables
users$employee_has_quit[1:5] <- TRUE
# === update tables
postgis_update(conn = db,
df = users,
tbl = "app_users",
id_cols = "unique_id",
update_cols = "employee_has_quit")
# === close connection
DBI::dbDisconnect(db)
The problem is a bug in postGIStools. This is the code that generates the failing query:
query_text <- paste(query_text, ") AS", tbl_tmp, "(",
paste(quote_id(colnames(df)), collapse = ", "), ")",
"WHERE", paste(paste0(tbl_q, ".", id_q), "=",
paste0(tbl_tmp, ".", id_q),
collapse = " AND "))
Simply put, that won't work. They should be using placeholders. The code assumes that the input type can be the result of make_str_quote (by proxy of df_q and quote_str). That's a faulty assumption, as seen here:
CREATE TABLE foo ( a uuid );
INSERT INTO foo VALUES ( quote_literal(gen_random_uuid()) ) ;
ERROR: column "a" is of type uuid but expression is of type text
LINE 1: INSERT INTO foo VALUES ( quote_literal(gen_random_uuid()) ) ...
^
HINT: You will need to rewrite or cast the expression.
My suggestion is you follow the docs,
Note: This package is deprecated. For new projects, we recommend using the sf package to interface with geodatabases.
You may be able to work around this by doing this
CREATE CAST (varchar AS uuid)
WITH INOUT
AS IMPLICIT;
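Another route that avoids changing casts in the database is to skip postgis_update and issue the UPDATE with an explicit cast on the key. This is a rough sketch only: `build_update` is a hypothetical helper, the table and column names are taken from the question, and the UUID is validated against a simple pattern before it is placed in the statement.

```r
# Canonical 8-4-4-4-12 hexadecimal UUID layout.
uuid_pattern <- paste0("^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-",
                       "[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$")

# Build an UPDATE for one row, casting the key to uuid explicitly.
# Assumption: app_users.unique_id is a uuid column, as in the question.
build_update <- function(id, has_quit) {
  stopifnot(is.character(id), length(id) == 1, grepl(uuid_pattern, id))
  sprintf(
    "UPDATE app_users SET employee_has_quit = %s WHERE unique_id = '%s'::uuid;",
    if (isTRUE(has_quit)) "TRUE" else "FALSE",
    id
  )
}

# Example UUID, made up for illustration.
stmt <- build_update("123e4567-e89b-12d3-a456-426614174000", TRUE)
```

Each statement could then be sent with dbSendQuery or dbGetQuery on the open connection, one row at a time.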

In R how can I use multiple prepared statements with DBI + RSQLite simultaneously?

Suppose you're trying to do a typical insert-or-update loop with RSQLite. I would expect the following to work:
library(DBI)
testdb <- dbConnect(RSQLite::SQLite(), "test.sqlite")
dbExecute(testdb, "CREATE TABLE spray_count (spray TEXT, count INTEGER)")
urs <- dbSendStatement(testdb, "UPDATE spray_count SET count = count + :count WHERE spray = :spray")
irs <- dbSendStatement(testdb, "INSERT INTO spray_count VALUES (:spray, :count)")
for (i in 1:nrow(InsectSprays)) {
print(paste("update", i))
dbBind(urs, InsectSprays[i,])
if (!dbGetRowsAffected(urs)) {
print(paste("insert", i))
dbBind(irs, InsectSprays[i,])
}
}
But it does not:
[1] "update 1"
Error in rsqlite_bind_rows(res@ptr, params) :
external pointer is not valid
In addition: Warning message:
Closing open result set, pending rows
Basically it seems that you can have only one prepared statement at a time, and creating a second one somehow invalidates the first. Am I missing something or is this a limitation of DBI and/or RSQLite? I have DBI v0.6-1 and RSQLite v1.1-2.
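One way to sidestep holding two result sets open at once is to run each statement to completion with dbExecute() and pass the values through its params argument. The sketch below assumes a reasonably recent DBI and RSQLite (newer than the versions listed in the question) and uses an in-memory database instead of test.sqlite.

```r
library(DBI)

# In-memory database so the sketch leaves no file behind.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE spray_count (spray TEXT, count INTEGER)")

for (i in seq_len(nrow(InsectSprays))) {
  # Named list matching the :spray and :count placeholders.
  row <- list(spray = as.character(InsectSprays$spray[i]),
              count = InsectSprays$count[i])
  # dbExecute() sends the statement, returns rows affected, and clears the result.
  updated <- dbExecute(
    con,
    "UPDATE spray_count SET count = count + :count WHERE spray = :spray",
    params = row
  )
  if (updated == 0) {
    dbExecute(con, "INSERT INTO spray_count VALUES (:spray, :count)",
              params = row)
  }
}

result <- dbGetQuery(con, "SELECT spray, count FROM spray_count ORDER BY spray")
dbDisconnect(con)
```

This prepares each statement again on every iteration, so it trades some speed for never having more than one open result set on the connection.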

Parameters and NULL

I'm having trouble passing NULL as a parameter to an INSERT query using RPostgres and RPostgreSQL:
In PostgreSQL:
create table foo (ival int, tval text, bval bytea);
In R:
This works:
res <- dbSendQuery(con, "INSERT INTO foo VALUES($1, $2, $3)",
params=list(ival=1,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
But this throws an error:
res <- dbSendQuery(con, "INSERT INTO foo VALUES($1, $2, $3)",
params=list(ival=NULL,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
Using RPostgres, the error message is:
Error: expecting a string
Under RPostgreSQL, the error is:
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: invalid input
syntax for integer: "NULL"
)
Substituting NA would be fine with me, but it isn't a work-around - a literal 'NA' gets written to the database.
Using e.g. integer(0) gives the same "expecting a string" message.
You can use NULLIF directly in your insert statement:
res <- dbSendQuery(con, "INSERT INTO foo VALUES(NULLIF($1, 'NULL')::integer, $2, $3)",
params=list(ival=NULL,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
works with NA as well.
One way to work around not knowing how to express a NULL value in R in a form the PostgreSQL package can translate is simply to leave out the column whose value should be NULL in the database.
So in your example you could use this:
res <- dbSendQuery(con, "INSERT INTO foo (tval, bval) VALUES($1, $2)",
params=list(tval = 'not quite null',
bval = charToRaw('asdf')
)
)
when you want ival to have a NULL value. This of course assumes that ival in your table is nullable, which may not be the case.
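The column-omitting approach can be generalised so the column list is derived from whichever parameters are not NULL. This is a sketch only: `build_insert` is a hypothetical helper, it assumes the list names match the column names of foo (ival, tval, bval), and it does not quote identifiers.

```r
# Build an INSERT that lists only the columns whose values are not NULL.
build_insert <- function(table, values) {
  keep <- values[!vapply(values, is.null, logical(1))]
  cols <- paste(names(keep), collapse = ", ")
  placeholders <- paste0("$", seq_along(keep), collapse = ", ")
  list(
    sql = sprintf("INSERT INTO %s (%s) VALUES (%s)", table, cols, placeholders),
    params = unname(keep)  # positional, in the same order as the columns
  )
}

ins <- build_insert("foo", list(ival = NULL,
                                tval = "not quite null",
                                bval = charToRaw("asdf")))
```

Here `ins$sql` and `ins$params` would be passed to dbSendQuery(con, ins$sql, params = ins$params), and the omitted column falls back to its default, which is NULL unless the table defines another one.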
Thanks all for the help. Tim's answer is a good one, and I used it to catch the integer values. I went a different route for the rest of it, writing a function in PostgreSQL to handle most of this. It looks roughly like:
CREATE OR REPLACE FUNCTION add_stuff(ii integer, tt text, bb bytea)
RETURNS integer
AS
$$
DECLARE
bb_comp bytea;
rows integer;
BEGIN
bb_comp = convert_to('NA', 'UTF8'); -- my database is in UTF8.
-- front-end catches ii is NA; RPostgres blows up
-- trying to convert 'NA' to integer.
tt = nullif(tt, 'NA');
bb = nullif(bb, bb_comp);
INSERT INTO foo VALUES (ii, tt, bb);
GET DIAGNOSTICS rows = ROW_COUNT;
RETURN rows;
END;
$$
LANGUAGE plpgsql VOLATILE;
Now to have a look at the RPostgres source and see if there's an easy-enough way to make it handle NULL / NA a bit more easily. Hoping that it's missing because nobody thought of it, not because it's super-tricky. :)
This will give the "wrong" answer if someone tries to put a literal 'NA' into the database and means something other than NULL / NA (e.g. NA = "North America"); given our use case, that seems very unlikely. We'll see in six months' time.
