Problems creating and populating tables in SAP HANA from R with RODBC

I am trying to write data to a table in a particular schema in HANA (SPS 11) using the RODBC package in R, and am running into problems that I hope someone can help with.
I am using sqlSave to create the table and write to it with the command below, but I am getting odd results.
res <- sqlSave(ch, dim_product_master_test, tablename = table.for.save, rownames = FALSE, verbose = TRUE)
Query: CREATE TABLE MYSCHEMA."DIM_PRODUCTSX" ("ProdSrcMonth" varchar(255), "Category" varchar(255), "SubCategory" varchar(255), "Brand" varchar(255), "Material" INTEGER, "Product" varchar(255), "EAN" varchar(255) .... etc)
I am getting the error:
Error in sqlColumns(channel, tablename) :
‘MYSCHEMA."DIM_PRODUCTSX"’: table not found on channel
However, the table is being created; sqlSave then can't seem to find it or add the data.
I tried different quoting schemes (including quoting the schema name), but got the same result.
Query: CREATE TABLE "MYSCHEMA"."DIM_PRODUCTSY" ("ProdSrcMonth" varchar(255), "Category" varchar(255), "SubCategory" varchar(255), "Brand" varchar(255), "Material" INTEGER, "Product" varchar(255), "EAN" varchar(255) ... etc
Error in sqlColumns(channel, tablename) :
‘"MYSCHEMA"."DIM_PRODUCTSY"’: table not found on channel
I tried quoting both identifiers, but it made no difference. Again, the table is created but cannot be populated.
If I just throw the data frame at sqlSave, it happily creates the table and adds the data, but I need more control than that.
Also, does anyone know how to create column-store tables? sqlSave seems to default to row store.
Thanks in advance.

Generally, it's a good idea to specify the target table in SAP HANA beforehand. That way, things like the COLUMN/ROW store setting and the specific data types for each column can be set as they should be (e.g. sqlSave doesn't seem to create NVARCHAR columns even when Unicode data needs to be saved).
This is an example that just works out of the box for me (also on SPS 11):
library("RODBC")
ch <- odbcConnect("SK1", uid = "DEVDUDE", pwd = "*******")
table.for.save <- 'AIRQUALITY'
aqdata <- airquality
sqlSave(ch, dat = aqdata, tablename = table.for.save, verbose = TRUE, rownames = FALSE)
odbcClose(ch)
Query: CREATE TABLE "AIRQUALITY" ("Ozone" INTEGER, "SolarR" INTEGER, "Wind" DOUBLE, "Temp" INTEGER, "Month" INTEGER, "Day" INTEGER)
Query: INSERT INTO "AIRQUALITY" ( "Ozone", "SolarR", "Wind", "Temp", "Month", "Day" ) VALUES ( ?,?,?,?,?,? )
Binding: 'Ozone' DataType 4, ColSize 10
Binding: 'SolarR' DataType 4, ColSize 10
Binding: 'Wind' DataType 8, ColSize 15
Binding: 'Temp' DataType 4, ColSize 10
Binding: 'Month' DataType 4, ColSize 10
Binding: 'Day' DataType 4, ColSize 10
Parameters:
no: 1: Ozone 41//no: 2: SolarR 190//no: 3: Wind 7.4//no: 4: Temp 67//no: 5: Month 5//no: 6: Day 1//
...
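To cover the column-store part of the question: below is a minimal, unverified sketch of the pre-create-then-append approach (the table and column definitions are illustrative and must match your data frame exactly; whether sqlSave's table lookup honours SET SCHEMA depends on the ODBC driver):
library("RODBC")
ch <- odbcConnect("SK1", uid = "DEVDUDE", pwd = "*******")
# Point the session at the target schema, then pre-create the table as a
# COLUMN table with the desired types (NVARCHAR for Unicode data).
sqlQuery(ch, "SET SCHEMA MYSCHEMA")
sqlQuery(ch, 'CREATE COLUMN TABLE "DIM_PRODUCTS_TEST" (
                "ProdSrcMonth" NVARCHAR(255),
                "Category" NVARCHAR(255),
                "Material" INTEGER)')
# Append into the existing table instead of letting sqlSave create it;
# the data frame column names must match the table definition exactly.
sqlSave(ch, dat = dim_product_master_test, tablename = "DIM_PRODUCTS_TEST",
        append = TRUE, rownames = FALSE, verbose = TRUE)
odbcClose(ch)
Connecting as a user whose default schema is the target schema (as the working example above does) sidesteps the schema-quoting problem from the question entirely.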

Related

How to insert data from R into Oracle table with identity column?

Assume I have a simple table in an Oracle database:
CREATE TABLE schema.d_test
(
id_record integer GENERATED AS IDENTITY START WITH 95000 NOT NULL,
DT DATE NOT NULL,
var varchar(50),
num float,
PRIMARY KEY (ID_RECORD)
)
And I have a data frame in R (dplyr is needed for the pipe and mutate()):
library(dplyr)
dt = c('2022-01-01', '2005-04-01', '2011-10-02')
var = c('sgdsg', 'hjhgjg', 'rurtur')
num = c(165, 1658.5, 8978.12354)
data = data.frame(dt, var, num) %>%
mutate(dt = as.Date(dt))
I'm trying to insert data into Oracle d_test table using the code
data %>%
dbWriteTable(
oracle_con,
value = .,
date = T,
'D_TEST',
append = T,
row.names=F,
overwrite = F
)
But the following error is returned:
Error in .oci.WriteTable(conn, name, value, row.names = row.names, overwrite = overwrite, :
Error in .oci.GetQuery(con, stmt, data = value) :
ORA-00947: not enough values
What's the problem?
How can I fix it?
Thank you.
This is pure Oracle (I don't know R).
Sample table:
SQL> create table test_so (id number generated always as identity not null, name varchar2(20));
Table created.
SQL> insert into test_so(name) values ('Name 1');
1 row created.
My initial idea was to suggest inserting any value into the ID column, hoping that Oracle would discard it and generate its own value. However, that won't work:
SQL> insert into test_so (id, name) values (-100, 'Name 2');
insert into test_so (id, name) values (-100, 'Name 2')
*
ERROR at line 1:
ORA-32795: cannot insert into a generated always identity column
But, if you can afford to recreate the table so that it doesn't automatically generate the ID column's value and instead uses a "workaround" we relied on before identity columns existed in Oracle - a sequence and a trigger - you might be able to "fix" it.
SQL> drop table test_so;
Table dropped.
SQL> create table test_so (id number not null, name varchar2(20));
Table created.
SQL> create sequence seq_so;
Sequence created.
SQL> create or replace trigger trg_bi_so
2 before insert on test_so
3 for each row
4 begin
5 :new.id := seq_so.nextval;
6 end;
7 /
Trigger created.
Inserting only the name (Oracle will use the trigger to populate the ID):
SQL> insert into test_so(name) values ('Name 1');
1 row created.
This is what you'll do in your code - provide a dummy ID value, just to avoid the
ORA-00947: not enough values
error you have now. The trigger will discard it and use the sequence anyway:
SQL> insert into test_so (id, name) values (-100, 'Name 2');
1 row created.
SQL> select * from test_so;
ID NAME
---------- --------------------
1 Name 1
2 Name 2 --> this is a row which was supposed to have ID = -100
SQL>
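Translated back to the R side of the question, a rough sketch of that dummy-ID idea (assuming the oracle_con connection and data frame from the question; the placeholder value -100 is arbitrary since the trigger overwrites it, and I have not verified how ROracle's dbWriteTable maps data frame columns to table columns in append mode):
# Prepend a placeholder id_record so the data frame has one value per table column,
# in the table's column order; the BEFORE INSERT trigger replaces it with seq_so.nextval.
data_with_id <- cbind(id_record = -100, data)
DBI::dbWriteTable(oracle_con, "D_TEST", data_with_id,
                  append = TRUE, row.names = FALSE, overwrite = FALSE)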
The way you can handle this problem is to create the table with GENERATED BY DEFAULT ON NULL AS IDENTITY, like this:
CREATE TABLE CM_RISK.d_test
(
id_record integer GENERATED BY DEFAULT ON NULL AS IDENTITY START WITH 5000 NOT NULL ,
DT date NOT NULL,
var varchar(50),
num float,
PRIMARY KEY (ID_RECORD)
)
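Staying in R, a hedged sketch of using that table definition (assuming the oracle_con connection from the question and ROracle's data-frame bind syntax, the same mechanism visible in the traceback above) would be to insert with an explicit column list so the identity column is simply omitted:
# id_record is GENERATED BY DEFAULT ON NULL, so leaving it out of the column
# list lets Oracle generate it; the data frame columns bind to :1, :2, :3.
res <- DBI::dbSendQuery(oracle_con,
                        "insert into d_test (dt, var, num) values (:1, :2, :3)",
                        data = data)
DBI::dbClearResult(res)
DBI::dbCommit(oracle_con)  # make the inserted rows permanent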

SQL Server ALTER datetime to datetime2 does not work

I am trying to convert a "datetime" variable to "datetime2" format.
# Load libraries
library(DBI)
library(tidyverse)
# Create dataframe
df <- data.frame("myid" = stringi::stri_rand_strings(5, 5),
"mydate" = c(Sys.time(), Sys.time()-1, Sys.time()-2, Sys.time()-3, Sys.time()-4) )
# Create SQL table sschema.ttable
DBI::dbWriteTable(conn = connection,
name = DBI::Id(schema = "sschema", table = "ttable"),
value = df,
overwrite = TRUE,
append = FALSE)
# Query for variable type in the SQL table
query <- paste0("exec sp_columns ", "ttable")
query <- DBI::dbSendQuery(connection, query)
res <- NULL
res <- DBI::dbFetch(query)
DBI::dbClearResult(query)
view(res)
# Alter mydate to datetime2
query <- DBI::dbSendStatement(conn = connection,
statement = paste0("ALTER TABLE sschema.ttable ALTER COLUMN mydate datetime2"))
DBI::dbFetch(query)
DBI::dbClearResult(query)
but this leads to the error
Error: nanodbc/nanodbc.cpp:1617: 00000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The UPDATE permission was denied on the object 'ttable', database 'dbo', schema 'sschema'.
'ALTER TABLE sschema.ttablename ALTER COLUMN mydate datetime2'
However, converting another VARCHAR(10) variable in the same table to VARCHAR(100) works fine. Any idea what the problem is? How can I get this working?
I am working with Microsoft SQL Azure version 12, from an RStudio Server session, using the DBI library.
To change the data type of a column you must have both the ALTER permission and UPDATE permission on the table.
From the docs:
Adding a column that updates the rows of the table requires UPDATE permission on the table.
ALTER TABLE - permissions
This goes for ALTERing an existing column too (converting datetime to datetime2 has to touch the existing rows, whereas widening a VARCHAR is only a metadata change, which is presumably why that one succeeded), as you can verify like this:
use tempdb
go
revert
go
if exists(select * from sys.database_principals where name = 'fred')
drop user fred
go
drop table if exists tablename
go
create user fred without login
create table tablename(id int, variablename varchar(20))
go
grant select on tablename to fred
--grant update on tablename to fred --uncomment to clear error
grant alter on schema::dbo to fred
execute as user='fred'
ALTER TABLE dbo.tablename ALTER COLUMN variablename datetime2
revert
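Back in R, once a user with sufficient rights has granted UPDATE on the table (e.g. GRANT UPDATE ON sschema.ttable TO <your user>), the ALTER from the question should go through. A slightly simpler way to run such DDL than the dbSendStatement/dbFetch/dbClearResult sequence is DBI::dbExecute:
# assumes the same `connection` object used in the question
DBI::dbExecute(connection,
               "ALTER TABLE sschema.ttable ALTER COLUMN mydate datetime2")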

How to append to SQLite table in R with autogenerated fields

This is a similar problem to this question, but I do not want the missing columns filled in with NA, because the missing columns have meaningful default values including the primary key.
I am trying to append to a SQLite table from R where the table has some auto-generated fields, specifically the primary key, and two timestamp values. The first timestamp is the created date, and the second timestamp is a modified date.
Here is the table structure:
CREATE TABLE "level1" (
"l1id" bigint(20) NOT NULL ,
"l0id" bigint(20) DEFAULT NULL,
"acid" bigint(20) DEFAULT NULL,
"cndx" int(11) DEFAULT NULL,
"repi" int(11) DEFAULT NULL,
"created_date" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
"modified_date" timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
"modified_by" varchar(100) DEFAULT NULL,
PRIMARY KEY ("l1id")
)
When I have tried doing the exact same thing using MySQL, dbWriteTable automatically handles the default values for missing columns, and populates the primary key and created_date properly (AND it matches the order of the columns automatically).
How can I achieve the same behavior with the RSQLite package? I am not sure if I have the database configured incorrectly, or if I need some additional steps within R.
I have tried pre-populating the missing fields with NA & 'null', but in both cases I get an error saying:
Warning message:
In value[[3L]](cond) :
RS-DBI driver: (RS_SQLite_exec: could not execute: column l1id is not unique)
And the data does not get written.
The Solution
I figured out a solution, based largely on the dbWriteFactor function Ari Friedman wrote as an answer to his question. Below I show the portion of code I used, modified to work specifically with the data.table package.
It is also very important to note that I had to change the SQLite table structure. To get this to work, I had to remove the "NOT NULL" designation from all auto-generated fields.
New Table Structure
CREATE TABLE "level1" (
"l1id" INTEGER PRIMARY KEY,
"l0id" bigint(20) DEFAULT NULL,
"acid" bigint(20) DEFAULT NULL,
"cndx" int(11) DEFAULT NULL,
"repi" int(11) DEFAULT NULL,
"created_date" timestamp DEFAULT CURRENT_TIMESTAMP,
"modified_date" timestamp DEFAULT '0000-00-00 00:00:00',
"modified_by" varchar(100) DEFAULT NULL
);
Adapted Code Sample
dbcon <- do.call(dbConnect, db_pars)
tempTbl <- "temp_table"
if (dbExistsTable(dbcon, tempTbl)) dbRemoveTable(dbcon, tempTbl)
dbWriteTable(conn = dbcon,
name = tempTbl,
value = dat,
row.names = FALSE,
append = FALSE)
tbl_flds <- loadColNames(tbl, db)
tmp_flds <- names(dat)
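# Insert only the columns present in dat; the auto-generated fields that are
# omitted (l1id, created_date, modified_date) fall back to their DEFAULT values.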
status <- dbSendQuery(dbcon,
paste("INSERT INTO", tbl,
"(", paste(tmp_flds, collapse = ","), ")",
"SELECT",
paste(tmp_flds, collapse = ","),
"FROM",
tempTbl))
# Remove temporary table
dbRemoveTable(dbcon, tempTbl)
dbDisconnect(dbcon)
where db_pars is a list of database parameters to establish the connection.
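For instance, a hypothetical db_pars for an SQLite file (the file path and the tbl target-table name here are purely illustrative) could look like:
db_pars <- list(drv = RSQLite::SQLite(), dbname = "path/to/my_database.sqlite")
tbl <- "level1"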

Single insertion of data on one date in SQL Server?

ALTER PROCEDURE [dbo].[K_FS_InsertMrpDetails]
@date datetime,
@feedtype varchar(50),
@rateperkg float,
@rateper50kg float,
@updatedby varchar(50)
AS
BEGIN
INSERT INTO K_FS_FeedMrpDetails([date], feedtype, rateperkg, rateper50kg, updatedby, updatedon)
VALUES(@date, @feedtype, @rateperkg, @rateper50kg, @updatedby, getdate())
SELECT '1' AS status
END
With this procedure we insert 9 rows at a time, but what I want is that, for a date that already has details, no further details can be inserted. How can I do this? Please help me.
Add a unique constraint on the column [date]. That will prevent you from adding more than one row with the same [date] value.
Update:
To allow 9 rows for each date, you can add a computed column D that removes the time part, plus a column R that will hold the values 1 to 9. Use a check constraint on R to only allow 1-9. Finally, create a unique constraint on (R, D).
Sample table definition:
create table T
(
ID int identity primary key,
DT datetime not null,
R tinyint check (R in (1,2,3,4,5,6,7,8,9)) not null,
D as dateadd(day, datediff(day, 0, DT), 0),
constraint ux_RD unique (R,D)
)
Try with this:
insert into T(DT, R) values(getdate(), 1)
insert into T(DT, R) values(getdate(), 2)
insert into T(DT, R) values(getdate(), 1)
The first and second inserts work fine; the third raises a unique constraint exception.

How to optimize SQLite indexes for query

I have a SQLite table defined as:
CREATE TABLE T(
CategoryCode NVARCHAR(64) NOT NULL,
DateTime DateTime NOT NULL,
ItemCode NVARCHAR(64) NOT NULL,
ItemName NVARCHAR(64) NOT NULL,
ItemValue NUMERIC(28, 4) NOT NULL
)
The question is how to optimize indexes for the following query:
SELECT
CategoryCode
,ItemCode
,ItemName
,SUM(ItemValue) as TotalValue
FROM T
WHERE CategoryCode = 'Code1'
AND DateTime < '2012-01-04 00:00:00'
GROUP BY ItemCode
Thank you!
For the exact query, you will need an index on T(CategoryCode, DateTime) or T(DateTime, CategoryCode), depending on which column is more selective.
However, it is unwise to create an index for a single query without a more holistic view of all access to the table.
For example, you may find that if most of the data in the table has CategoryCode = 'Code1', then the index should only be created on the DateTime column.
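Staying with the R register used elsewhere in this thread, the suggested two-column index could be created like so (a sketch assuming a DBI/RSQLite connection named con; the index name is arbitrary):
DBI::dbExecute(con,
               "CREATE INDEX IF NOT EXISTS idx_T_CategoryCode_DateTime
                ON T (CategoryCode, DateTime)")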
