Write to Snowflake VARIANT column from R

I am trying to load data into Snowflake using the following code, but I am getting an error.
con <- DBI::dbConnect(
drv = odbc::odbc(),
driver = "SnowflakeDSIIDriver",
server = "<>",
authenticator = 'externalbrowser',
warehouse = "<>",
database = "<>",
UID = "<>",
role = "<>"
)
DBI::dbAppendTable(con, name = DBI::Id(schema = "<>", table = "<>"), value = tmp[1:2,])
tmp was downloaded from the same Snowflake table using RStudio:
```{sql connection=con, output.var = 'tmp'}
select top 10 *
FROM <>
```
The error seems to be stemming from a VARIANT column where I store a JSON string.
Error in new_result(connection#ptr, statement, immediate) :
nanodbc/nanodbc.cpp:1374: 22000: SQL compilation error:
Expression type does not match column data type, expecting VARIANT but got VARCHAR(2) for column FEATURES

I ran into this once, and the cause was invalid JSON (a bracket was missing somewhere). Perhaps this helps.
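
A quick way to test that hypothesis is to validate every value of the JSON column before appending. The sketch below assumes the jsonlite package is installed; `find_invalid_json()` is a hypothetical helper name, and the sample strings are made up for illustration.

```r
library(jsonlite)

# Hypothetical helper: return the positions of elements that are not valid JSON.
# NA entries are skipped, on the assumption that they are meant to become SQL NULLs.
find_invalid_json <- function(x) {
  x <- as.character(x)
  ok <- vapply(
    x,
    function(s) is.na(s) || isTRUE(as.logical(jsonlite::validate(s))),
    logical(1),
    USE.NAMES = FALSE
  )
  which(!ok)
}

# Made-up sample: the second string is missing its closing square bracket
features <- c('{"a": 1}', '{"a": [1, 2}', NA)
bad_rows <- find_invalid_json(features)
```

With the data from the question this could be called as `find_invalid_json(tmp$FEATURES)`; any rows it reports are worth inspecting before calling `dbAppendTable()`.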

Related

Insert R dataframe into SQL (RODBC) - error table not found

I would like to write my whole data frame from R to SQL Server, preferably using RODBC with the sqlSave function (not sqlQuery). Here is my sample code.
library(RODBC)
myconn <- odbcDriverConnect("some connection string")
mydf <- data.frame(col_1 = c(1,2,3), col_2 = c(2,3,4))
sqlSave(myconn, mydf, tablename = '[some_db].[some_schema].[my_table]', append = F, rownames = F, verbose=TRUE)
odbcClose(myconn)
After I execute it, I get the following error message:
Error in sqlColumns(channel, tablename) :
‘my_table’: table not found on channel
When I check in SQL Server, an empty table is present.
If I run the same code again, I get a different error message:
Error in sqlSave(myconn, mydf, tablename = "[some_db].[some_schema].[my_table]", :
42S01 2714 [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]There is already an object named 'my_table' in the database.
[RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE [some_db].[some_schema].[my_table] ("col_1" float, "col_2" float)'
Any suggestions on how to troubleshoot?
UPDATE
In SSMS I can run the following commands successfully:
CREATE TABLE [some_db].[some_schema].[my_table] (
test int
);
drop table [some_db].[some_schema].[my_table]
Here are details of connection string:
Driver=ODBC Driver 17 for SQL Server; Server=someserveraddress; Uid=user_login; Pwd=some_password
To avoid the error, you could specify the database in the connection string:
Driver=ODBC Driver 17 for SQL Server; Server = someserveraddress; database = some_db; Uid = user_login; Pwd = some_password
and avoid using brackets:
sqlSave(myconn, mydf, tablename = 'some_schema.my_table', append = F, rownames = F, verbose=TRUE)

RPostgreSQL and DBI: "operator does not exist: uuid = text"

When using dbReadTable to read in database tables that uses UUID as the primary key, I get the following warning message.
1: In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type uuid (id:2950) in column 0)
When I modify the table I loaded and try to update the database using postgis_update, I get the following error message:
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: operator does not exist: uuid = text
I get that the UUID type is not available in R, but is there a way that we can make the database believe the character vector "unique_id" is UUID instead of text?
Code:
library(RPostgreSQL)
library(postGIStools)
pgdrv <- dbDriver(drvName = "PostgreSQL")
# === open connection
db <- DBI::dbConnect(pgdrv,
dbname="database",
host="localhost", port=5432,
user = 'postgres')
# === get tables
users <- dbReadTable(db, "app_users")
# === interaction with tables
users$employee_has_quit[1:5] <- TRUE
# === update tables
postgis_update(conn = db,
df = users,
tbl = "app_users",
id_cols = "unique_id",
update_cols = "employee_has_quit")
# === close connection
DBI::dbDisconnect(db)
The problem is a bug in postGIStools. This is the code that builds the failing query:
query_text <- paste(query_text, ") AS", tbl_tmp, "(",
paste(quote_id(colnames(df)), collapse = ", "), ")",
"WHERE", paste(paste0(tbl_q, ".", id_q), "=",
paste0(tbl_tmp, ".", id_q),
collapse = " AND "))
Simply put, that won't work. They should be using placeholders. The code assumes that the input type can be the result of make_str_quote (by proxy of df_q and quote_str). That's a faulty assumption, as seen here:
CREATE TABLE foo ( a uuid );
INSERT INTO foo VALUES ( quote_literal(gen_random_uuid()) ) ;
ERROR: column "a" is of type uuid but expression is of type text
LINE 1: INSERT INTO foo VALUES ( quote_literal(gen_random_uuid()) ) ...
^
HINT: You will need to rewrite or cast the expression.
My suggestion is you follow the docs,
Note: This package is deprecated. For new projects, we recommend using the sf package to interface with geodatabases.
You may be able to work around this by creating an implicit cast:
CREATE CAST (varchar AS uuid)
WITH INOUT
AS IMPLICIT;
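
If creating a database-wide implicit cast is not an option, another possibility is to bypass postgis_update and issue the UPDATE statements yourself with an explicit `::uuid` cast. This is only a sketch: `build_uuid_updates()` is a hypothetical helper, not part of postGIStools. It pastes values into the SQL text, which is tolerable here only because every id is first checked against a strict UUID pattern and the updated column must be logical. The table and column names are assumed to come from trusted code, not user input.

```r
# Hypothetical helper: one UPDATE statement per row, with an explicit uuid cast
build_uuid_updates <- function(df, tbl, id_col, update_col) {
  uuid_pattern <- paste0(
    "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-",
    "[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
  )
  ids <- as.character(df[[id_col]])
  values <- df[[update_col]]

  # Refuse anything that is not a well-formed UUID or a non-missing logical
  stopifnot(
    all(grepl(uuid_pattern, ids)),
    is.logical(values),
    !anyNA(values)
  )

  sprintf(
    'UPDATE "%s" SET "%s" = %s WHERE "%s" = \'%s\'::uuid',
    tbl, update_col, ifelse(values, "TRUE", "FALSE"), id_col, ids
  )
}

# Made-up example row standing in for the app_users table from the question
users <- data.frame(
  unique_id = "123e4567-e89b-12d3-a456-426614174000",
  employee_has_quit = TRUE,
  stringsAsFactors = FALSE
)
stmts <- build_uuid_updates(users, "app_users", "unique_id", "employee_has_quit")
```

Each resulting statement could then be sent over the open connection, for example with `DBI::dbExecute(db, stmt)`.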

R ODBC/DBI writing to a SQL Server table with non-default schemas

I've been very excited about the new schema behavior in dbWriteTable but have not been able to get it working. I'm running DBI 0.7-15 and SQL Server 2012. Perhaps I am just using the Id function incorrectly?
The following code block cannot find the correct method.
cs <- "driver={SQL Server};
server={localhost};
database=testSAM;
trusted_connection=true;"
con <- DBI::dbConnect(odbc::odbc(), .connection_string = cs)
df <- data.frame(id=1, word_of_day = "happy")
table_id <- DBI::Id(name = "hcai_unit_tests",
schema = "dbo",
catalog = "testSAM")
# Try with Id
res <- DBI::dbWriteTable(conn = con,
name = table_id,
value = df,
append = TRUE)
# Errors with:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘dbWriteTable’ for
signature ‘"Microsoft SQL Server", "SQL", "missing"’
Trying to use dbQuoteIdentifier produces a different error
res <- odbc::dbWriteTable(conn = con,
name = DBI::dbQuoteIdentifier(con, t),
value = df,
append = TRUE)
# Errors with:
Error: <SQL> 'CREATE TABLE "testSAM"."dbo"."hcai_unit_tests" (
"id" FLOAT,
"word_of_day" varchar(255)
)
'
nanodbc/nanodbc.cpp:1587: 42S01: [Microsoft][ODBC SQL Server Driver]
[SQL Server]There is already an object named
'hcai_unit_tests' in the database.
Here, append=TRUE should be preventing this error from coming up...

Insert R data.frame data into HIVE DB

I have a data.frame with 1000-2000 records. I am using RJDBC to insert it into HIVE and followed the steps below, but I am getting an error:
library(RJDBC)
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
classPath = list.files("C:/R Install/hive_lib", pattern = "jar$", full.names = TRUE),
identifier.quote = "'")
conn <- dbConnect(drv, "servername", "username", "pwd")
dbWriteTable(conn, name = tablename, value = data.frame, row.names = FALSE)
Error:
Error in .local(conn, statement, ...) :
execute JDBC update query failed in dbSendUpdate (Error while compiling statement: FAILED: ParseException line 1:334 mismatched input 'PRECISION' expecting ) near 'DOUBLE' in create table statement)
What is the best way to insert R data frame records into HIVE?

Programmatically building SQL Query R/Shiny/RODBC

I'm building a SQL query statement using dateRangeInput() in R/Shiny. My issue is handling the strings needed to include the dates in the WHERE condition of the SQL:
Here is my code:
t.query <- paste0("Select [sensor_name], [temperature] from [dbo].
[temperature_sensor] where network_id = '24162' and date > "
, sQuote(format(input$my.dateRange[1], format="%d-%m-%Y"))
, " and date < "
, sQuote(format(input$my.dateRange[2], format="%d-%m-%Y"))
)
Now the statement closes with a single quote and I receive the error below:
42000 102 [Microsoft][ODBC Driver 13 for SQL Server][SQL
Server]Incorrect syntax near '‘'. [RODBC] ERROR: Could not
SQLExecDirect 'Select [sensor_name], [temperature] from
[dbo].[temperature_sensor] where network_id = '24162' and date >
‘18-09-2017’ and date < ‘22-09-2017’'
I need to close the string with " because I started it with "Select ...". I tried explicitly adding """ or dQuote("") to concatenate a ", but I'm still encountering an error.
Any advice is highly appreciated.
I'd recommend using RODBCext, which will allow you to parameterize your query as
library(RODBCext)
channel <- odbcConnect(...) # make your connection object here
Data <-
sqlExecute(channel = channel,
query = "Select [sensor_name], [temperature]
from [dbo].[temperature_sensor]
where network_id = ? and date between ? and ?",
data = list('24162',
format(input$my.dateRange[1],
format = "%Y-%m-%d"),
format(input$my.dateRange[2],
format = "%Y-%m-%d")),
fetch = TRUE,
stringsAsFactors = FALSE)
This approach has several advantages: it removes the frustration of matching quotes (which you shouldn't be doing by hand anyway, for the following reason), and it protects your data against SQL injection.
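
The error message in the question also hints at the direct cause: the generated SQL contains typographic quotes around the dates, which is what `sQuote()` can produce by default, and the dates are formatted as day-month-year. Where a parameterized query is not available, a minimal base R sketch of the string-building route might look like this. It assumes R 3.6.0 or later (for the `q` argument of `sQuote()`); `date_range` is a hypothetical stand-in for `input$my.dateRange`, and `as_sql_date()` is a made-up helper name.

```r
# Hypothetical stand-in for input$my.dateRange from the Shiny app
date_range <- as.Date(c("2017-09-18", "2017-09-22"))

# By default sQuote() may emit typographic (curly) quotes, depending on the
# locale and the useFancyQuotes option; q = FALSE forces plain ASCII quotes.
# Only Date objects are accepted, so the quoted text is digits and dashes.
as_sql_date <- function(d) {
  stopifnot(inherits(d, "Date"), !anyNA(d))
  sQuote(format(d, "%Y-%m-%d"), q = FALSE)
}

t.query <- paste0(
  "Select [sensor_name], [temperature] from [dbo].[temperature_sensor] ",
  "where network_id = '24162' and date > ", as_sql_date(date_range[1]),
  " and date < ", as_sql_date(date_range[2])
)
```

The parameterized version above remains the safer choice whenever any part of the query comes from free-text user input.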
