Inserting an R dataframe into a SQL table using a stored proc

I have a dataframe in R containing 10 rows and 7 columns. There's a stored procedure that does a few logic checks in the background and then inserts the data into the table 'commodity_price'.
library(RMySQL)

# Connection settings
mydb = dbConnect(MySQL(),
                 user = 'uid',
                 password = 'pwd',
                 dbname = 'database_name',
                 host = 'localhost')

# Listing the tables
dbListTables(mydb)

f = data.frame(
  location = rep('Bhubaneshwar', 4),
  sourceid = c(8, 8, 9, 2),
  product = c("Ingot", "Ingot", "Sow Ingot", "Alloy Ingot"),
  Specification = c('ie10', 'ic20', 'se07', 'se08'),
  Price = c(14668, 14200, 14280, 20980),
  currency = rep('INR', 4),
  uom = rep('INR/MT', 4)
)
For multiple rows insert, there's a pre-created stored proc 'PROC_COMMODITY_PRICE_INSERT', which I need to call.
for (i in 1:nrow(f))
{
dbGetQuery(mydb,"CALL PROC_COMMODITY_PRICE_INSERT(
paste(f$location[i],',',
f$sourceid[i],',',f$product[i],',',f$Specification[i],',',
f$Price[i],',',f$currency[i],',', f$uom[i],',',#xyz,')',sep='')
);")
}
I am repeatedly getting this error:
Error in .local(conn, statement, ...) :
could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[i],',',
f$sourceid[i],',',f$product[i],',',f$Specification' at line 2
I tried using RODBC but it's not connecting at all. How can I insert the data from the R dataframe into the 'commodity_price' table by calling a stored proc? Thanks in advance!

That is probably because your paste() call is inside the quoted SQL string, so it is sent to MySQL verbatim instead of being evaluated by R. This might work:
for (i in 1:nrow(f))
{
  dbGetQuery(mydb, paste("CALL PROC_COMMODITY_PRICE_INSERT(", f$location[i], ',',
                         f$sourceid[i], ',', f$product[i], ',', f$Specification[i], ',',
                         f$Price[i], ',', f$currency[i], ',', f$uom[i], ',', "#xyz", ");",
                         sep = ''))
}
or the one-liner:
dbGetQuery(mydb,paste0("CALL PROC_COMMODITY_PRICE_INSERT('",apply(f, 1, paste0, collapse = "', '"),"');"))

Trying the for loop:
for (i in 1:nrow(f))
{
  dbGetQuery(mydb, paste("CALL PROC_COMMODITY_PRICE_INSERT(", "'", f$location[i], "'", ',',
                         "'", f$sourceid[i], "'", ',', "'", f$product[i], "'", ',',
                         "'", f$Specification[i], "'", ',', "'", f$Price[i], "'", ',',
                         "'", f$currency[i], "'", ',', "'", f$uom[i], "'", ',',
                         '#xyz', ");", sep = ''))
}
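As a general note, pasting values straight into the CALL leaves all quoting and escaping to you. A sketch of a parameterized alternative using DBI::sqlInterpolate, assuming the procedure takes the seven data-frame columns (the #xyz argument from the question is left out here):
# Sketch, not from the original answers: sqlInterpolate() quotes each value
# for the connection, so string values arrive properly escaped.
for (i in 1:nrow(f)) {
  sql <- DBI::sqlInterpolate(mydb,
    "CALL PROC_COMMODITY_PRICE_INSERT(?loc, ?src, ?prod, ?spec, ?price, ?cur, ?uom)",
    loc = f$location[i], src = f$sourceid[i], prod = f$product[i],
    spec = f$Specification[i], price = f$Price[i], cur = f$currency[i],
    uom = f$uom[i])
  dbGetQuery(mydb, sql)
}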

Related

Insert R list into RPostgreSQL query

I'm running a PostgreSQL query based on an automated list of IDs stored in an R vector. I'm trying to determine how to include that list in my query so I don't have to hard-code the IDs each time I run it.
For example, I have a script that produces the list
id <- c("001","002","003")
and my query looks something like this:
SELECT *
FROM my_query
WHERE my_query.id_col IN ('001', '002', '003')
which I run using RPostgres:
library(RPostgres)

snappConnection <- DBI::dbConnect(RPostgres::Postgres(),
                                  host = "host",
                                  dbname = "dbname",
                                  user = "user",
                                  password = "pword",
                                  port = 0000)

core.data <- dbGetQuery(conn = snappConnection,
                        statement = "SELECT * FROM my_query WHERE my_query.id_col IN ('001', '002', '003')")
Is there a way to reference my "id" list from R in my query so that when "id" updates to new values, the query also updates to those new values?
glue_sql from the glue package should work:
query <- glue::glue_sql("
SELECT *
FROM my_query
WHERE my_query.id_col IN ({id*})
", .con = snappConnection)
core.data <- dbGetQuery(conn = snappConnection, statement = query)
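The {id*} syntax collapses the vector into a comma-separated list, with glue_sql quoting each element for the connection, so the query tracks whatever is currently in id.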
@dave-edison's answer solved my problem. While trying his approach, I got this to work.
I saved the query below as "my_query.sql"
SELECT *
FROM my_query
WHERE my_query.id_col IN ('string_to_replace')
then created a string and used gsub on the string.
library(tidyverse)
temp.script <- read_file("my_query.sql")
core.data.script <- gsub('string_to_replace',paste0(id,collapse = "', '"),temp.script)
From there I just ran my RPostgres script like above.
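The same idea works without a template file if you let DBI do the quoting. A sketch, assuming the connection and id vector from above:
# Sketch: dbQuoteLiteral() quotes each ID for this connection's SQL dialect
in_list <- paste(DBI::dbQuoteLiteral(snappConnection, id), collapse = ", ")
query <- paste0("SELECT * FROM my_query WHERE my_query.id_col IN (", in_list, ")")
core.data <- DBI::dbGetQuery(snappConnection, query)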

SQL Server ALTER datetime to datetime2 does not work

I am trying to convert a "datetime" variable to "datetime2" format.
# Load libraries
library(DBI)
library(tidyverse)

# Create dataframe
df <- data.frame("myid" = stringi::stri_rand_strings(5, 5),
                 "mydate" = c(Sys.time(), Sys.time() - 1, Sys.time() - 2, Sys.time() - 3, Sys.time() - 4))

# Create SQL table sschema.ttable
DBI::dbWriteTable(conn = connection,
                  name = DBI::Id(schema = "sschema", table = "ttable"),
                  value = df,
                  overwrite = TRUE,
                  append = FALSE)

# Query for the variable types in the SQL table
query <- DBI::dbSendQuery(connection, paste0("exec sp_columns ", "ttable"))
res <- DBI::dbFetch(query)
DBI::dbClearResult(query)
view(res)

# Alter mydate to datetime2
query <- DBI::dbSendStatement(conn = connection,
                              statement = "ALTER TABLE sschema.ttable ALTER COLUMN mydate datetime2")
DBI::dbFetch(query)
DBI::dbClearResult(query)
but this leads to the error
Error: nanodbc/nanodbc.cpp:1617: 00000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The UPDATE permission was denied on the object 'ttable', database 'dbo', schema 'sschema'.
'ALTER TABLE sschema.ttablename ALTER COLUMN mydate datetime2'
However, converting another VARCHAR(10) variable in the same table to VARCHAR(100) works fine. Any idea what the problem is? How can I get this working?
I am working with Microsoft SQL Azure version 12, from an RStudio Server session using the DBI library.
To change the data type of a column you must have both the ALTER permission and UPDATE permission on the table.
From the docs:
Adding a column that updates the rows of the table requires UPDATE permission on the table.
ALTER TABLE - permissions
This goes for ALTERing an existing column too, as you can verify like this:
use tempdb
go
revert
go
if exists(select * from sys.database_principals where name = 'fred')
drop user fred
go
drop table if exists tablename
go
create user fred without login
create table tablename(id int, variablename varchar(20))
go
grant select on tablename to fred
--grant update on tablename to fred --uncomment to clear error
grant alter on schema::dbo to fred
execute as user='fred'
ALTER TABLE dbo.tablename ALTER COLUMN variablename datetime2
revert
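From R, an administrator with GRANT rights could add the missing permission before retrying the ALTER; a sketch, where 'myuser' is a hypothetical user name:
# Run as an administrator; 'myuser' stands in for the actual database user
DBI::dbExecute(connection, "GRANT UPDATE ON sschema.ttable TO myuser;")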

Query a PostgreSQL table using a variable in R

I have a few tables in my PostgreSQL database with names similar to "114114.KP". I would like to query such a table dynamically from R using the RPostgres package.
I know that the name of the table makes it difficult to process, but is there a way to do it?
What I did so far, with i = "114114.KP":
if(dbExistsTable(con5, i) == TRUE){
ExistingPortfolios <- dbSendQuery(con3, paste("SELECT * from \"", i,"\""))
ExistingPortfolios <- fetch(ExistingPortfolios)
}
But this returns an error (which is normal):
> ExistingPortfolios <- dbSendQuery(con3, paste("SELECT * from \"", i,"\""))
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (connection with pending rows, close resultSet before continuing)
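That error means an earlier result set on the connection was never closed. A sketch of how this is commonly resolved (not from the original thread): fetch and close in one step with dbGetQuery, and use paste0 so no stray spaces end up inside the quoted table name:
# Sketch: dbGetQuery() fetches and closes its result set in one call;
# paste0() avoids the spaces paste() would insert inside the quotes
if (dbExistsTable(con5, i)) {
  ExistingPortfolios <- dbGetQuery(con3, paste0('SELECT * FROM "', i, '"'))
}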

In R how can I use multiple prepared statements with DBI + RSQLite simultaneously?

Suppose you're trying to do a typical insert-or-update loop with RSQLite. I would expect the following to work:
library(DBI)
testdb <- dbConnect(RSQLite::SQLite(), "test.sqlite")
dbExecute(testdb, "CREATE TABLE spray_count (spray TEXT, count INTEGER)")
urs <- dbSendStatement(testdb, "UPDATE spray_count SET count = count + :count WHERE spray = :spray")
irs <- dbSendStatement(testdb, "INSERT INTO spray_count VALUES (:spray, :count)")
for (i in 1:nrow(InsectSprays)) {
  print(paste("update", i))
  dbBind(urs, InsectSprays[i,])
  if (!dbGetRowsAffected(urs)) {
    print(paste("insert", i))
    dbBind(irs, InsectSprays[i,])
  }
}
But it does not:
[1] "update 1"
Error in rsqlite_bind_rows(res@ptr, params) :
external pointer is not valid
In addition: Warning message:
Closing open result set, pending rows
Basically it seems that you can have only one prepared statement at a time, and creating a second one somehow invalidates the first. Am I missing something or is this a limitation of DBI and/or RSQLite? I have DBI v0.6-1 and RSQLite v1.1-2.
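A common workaround (a sketch, not from the original thread) is to let dbExecute() prepare, bind, and clear each statement in a single call, so only one result set is ever open on the connection:
# Sketch: one statement per dbExecute() call; it returns the rows affected
for (i in 1:nrow(InsectSprays)) {
  n <- dbExecute(testdb,
    "UPDATE spray_count SET count = count + :count WHERE spray = :spray",
    params = list(count = InsectSprays$count[i],
                  spray = as.character(InsectSprays$spray[i])))
  if (n == 0) {
    dbExecute(testdb, "INSERT INTO spray_count VALUES (:spray, :count)",
              params = list(spray = as.character(InsectSprays$spray[i]),
                            count = InsectSprays$count[i]))
  }
}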

Parameters and NULL

I'm having trouble passing NULL as a parameter in an INSERT query using RPostgres and RPostgreSQL:
In PostgreSQL:
create table foo (ival int, tval text, bval bytea);
In R:
This works:
res <- dbSendQuery(con, "INSERT INTO foo VALUES($1, $2, $3)",
params=list(ival=1,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
But this throws an error:
res <- dbSendQuery(con, "INSERT INTO foo VALUES($1, $2, $3)",
params=list(ival=NULL,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
Using RPostgres, the error message is:
Error: expecting a string
Under RPostgreSQL, the error is:
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: invalid input
syntax for integer: "NULL"
)
Substituting NA would be fine with me, but it isn't a work-around - a literal 'NA' gets written to the database.
Using e.g. integer(0) gives the same "expecting a string" message.
You can use NULLIF directly in your insert statement:
res <- dbSendQuery(con, "INSERT INTO foo VALUES(NULLIF($1, 'NULL')::integer, $2, $3)",
params=list(ival=NULL,
tval= 'not quite null',
bval=charToRaw('asdf')
)
)
works with NA as well.
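NULLIF(a, b) returns NULL when its two arguments are equal, so the string the driver sends for the missing value ("NULL" in the RPostgreSQL error above) is turned back into a true NULL before the ::integer cast.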
One option here, to work around the problem of not knowing how to express a NULL value in R that the PostgreSQL package will successfully translate, is simply not to specify the column whose value you want to be NULL in the database.
So in your example you could use this:
res <- dbSendQuery(con, "INSERT INTO foo (col2, col3) VALUES($1, $2)",
params=list(tval = 'not quite null',
bval = charToRaw('asdf')
)
)
when you want col1 to have a NULL value. This of course assumes that col1 in your table is nullable, which may not be the case.
Thanks, all, for the help. Tim's answer is a good one, and I used it to catch the integer values. I went a different route for the rest of it, writing a function in PostgreSQL to handle most of this. It looks roughly like this:
CREATE OR REPLACE FUNCTION add_stuff(ii integer, tt text, bb bytea)
RETURNS integer
AS
$$
DECLARE
    bb_comp bytea;
    rows integer;
BEGIN
    bb_comp = convert_to('NA', 'UTF8'); -- my database is in UTF8.
    -- The front end catches ii being NA; RPostgres blows up
    -- trying to convert 'NA' to integer.
    tt = nullif(tt, 'NA');
    bb = nullif(bb, bb_comp);
    INSERT INTO foo VALUES (ii, tt, bb);
    GET DIAGNOSTICS rows = ROW_COUNT;
    RETURN rows;
END;
$$
LANGUAGE plpgsql VOLATILE;
Now to have a look at the RPostgres source and see if there's an easy way to make it handle NULL / NA. Hoping that it's missing because nobody thought of it, not because it's super-tricky. :)
This will give the "wrong" answer if someone is trying to put literally 'NA' into the database and mean something other than NULL / NA (e.g. NA = "North America"); given our use case, that seems very unlikely. We'll see in six months' time.
