This is a similar problem to this question, but I do not want the missing columns filled in with NA, because the missing columns have meaningful default values, including the primary key.
I am trying to append to an SQLite table from R where the table has some auto-generated fields: the primary key and two timestamps, one for the created date and one for the modified date.
Here is the table structure:
CREATE TABLE "level1" (
"l1id" bigint(20) NOT NULL ,
"l0id" bigint(20) DEFAULT NULL,
"acid" bigint(20) DEFAULT NULL,
"cndx" int(11) DEFAULT NULL,
"repi" int(11) DEFAULT NULL,
"created_date" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
"modified_date" timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
"modified_by" varchar(100) DEFAULT NULL,
PRIMARY KEY ("l1id")
)
When I do the exact same thing with MySQL, dbWriteTable automatically handles the default values for the missing columns, populating the primary key and created_date properly (and it matches the order of the columns automatically).
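(For reference, the MySQL-side call was essentially a plain append; this is only a sketch, with the connection object and the exact column subset as placeholders.)

# With RMySQL, appending only the columns present in the data lets the
# database fill l1id, created_date and modified_date from their defaults
dbWriteTable(con_mysql, "level1",
             value = dat[, c("l0id", "acid", "cndx", "repi", "modified_by")],
             append = TRUE, row.names = FALSE)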
How can I achieve the same behavior with the RSQLite package? I am not sure whether I have the database configured incorrectly or whether I need some additional steps within R.
I have tried pre-populating the missing fields with NA and 'null', but in both cases I get an error saying:
Warning message:
In value[[3L]](cond) :
RS-DBI driver: (RS_SQLite_exec: could not execute: column l1id is not unique)
And the data does not get written.
The Solution
I figured out a solution, based largely on the dbWriteFactor function Ari Friedman wrote as an answer to his question. Below I show the portion of code I used, modified to work specifically with the data.table package.
It is also very important to note that I had to change the SQLite table structure. To get this to work I had to remove the "NOT NULL" designation from all auto-generated fields, and l1id becomes an INTEGER PRIMARY KEY, which SQLite treats as an alias for the rowid and assigns automatically.
New Table Structure
CREATE TABLE "level1" (
"l1id" INTEGER PRIMARY KEY,
"l0id" bigint(20) DEFAULT NULL,
"acid" bigint(20) DEFAULT NULL,
"cndx" int(11) DEFAULT NULL,
"repi" int(11) DEFAULT NULL,
"created_date" timestamp DEFAULT CURRENT_TIMESTAMP,
"modified_date" timestamp DEFAULT '0000-00-00 00:00:00',
"modified_by" varchar(100) DEFAULT NULL
);
Adapted Code Sample
dbcon <- do.call(dbConnect, db_pars)
tempTbl <- "temp_table"
if (dbExistsTable(dbcon, tempTbl)) dbRemoveTable(dbcon, tempTbl)
dbWriteTable(conn = dbcon,
name = tempTbl,
value = dat,
row.names = FALSE,
append = FALSE)
# tbl is the name of the target table; loadColNames() is a helper that
# returns that table's column names
tbl_flds <- loadColNames(tbl, db)
tmp_flds <- names(dat)

# Insert only the columns supplied by the data; the omitted auto-generated
# fields pick up their default values
status <- dbSendQuery(dbcon,
                      paste("INSERT INTO", tbl,
                            "(", paste(tmp_flds, collapse = ","), ")",
                            "SELECT",
                            paste(tmp_flds, collapse = ","),
                            "FROM",
                            tempTbl))
# Remove temporary table
dbRemoveTable(dbcon, tempTbl)
dbDisconnect(dbcon)
where db_pars is a list of database parameters to establish the connection.
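For completeness, db_pars is simply a named list of dbConnect() arguments handed over via do.call; a minimal sketch for RSQLite (the file name is only an example) would be:

library(RSQLite)

# Named arguments for dbConnect(); adjust dbname to your database file
db_pars <- list(drv = SQLite(), dbname = "level1_db.sqlite")
dbcon   <- do.call(dbConnect, db_pars)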
Related
I have two tables in my PostgreSQL database:
CREATE TABLE touriste (
idclient BIGSERIAL PRIMARY KEY,
numclient INT,
nameclient VARCHAR(500),
codepost INT,
departement VARCHAR(500),
pays VARCHAR(100)
);
CREATE TABLE reservation (
idresa BIGSERIAL NOT NULL,
PRIMARY KEY(idresa),
dateresa DATE,
datearriv DATE,
datedep DATE,
idclient_cli BIGINT
REFERENCES touriste (idclient) MATCH FULL ON UPDATE CASCADE ON DELETE RESTRICT
);
I tried to fill the database tables with my data frame (which is already created in R) using the RPostgreSQL library. The problem is that the idclient_cli column ends up empty.
Here is my R code:
dbWriteTable(con, "touriste",
value = dataAdb[, c(2:4, 17:19)], append = TRUE, row.names = FALSE)
# query the data from postgreSQL
df_postgres_tou <- dbGetQuery(con, "SELECT * from touriste")
View(df_postgres_tou)
dbWriteTable(con, "reservation",
value = dataAdb[, c(5:7, 14, 12, 13, 15:16, 20)], append = TRUE, row.names = FALSE)
# query the data from postgreSQL
df_postgres_resa <- dbGetQuery(con, "SELECT * from reservation")
View(df_postgres_resa)
My question is: how can I match the idclient value with idclient_cli?
Thanks in advance.
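One way to do this (a sketch only; it assumes dataAdb has a unique numclient per client and reservation columns named dateresa, datearriv and datedep) is to read the generated idclient keys back, merge them into the data frame, and then write reservation with idclient_cli filled in:

# 1. Fetch the keys PostgreSQL generated when touriste was written
keys <- dbGetQuery(con, "SELECT idclient, numclient FROM touriste")

# 2. Attach them to the data frame via the shared numclient column
resa <- merge(dataAdb, keys, by = "numclient")
resa$idclient_cli <- resa$idclient

# 3. Write the reservation rows with the foreign key populated
dbWriteTable(con, "reservation",
             value = resa[, c("dateresa", "datearriv", "datedep", "idclient_cli")],
             append = TRUE, row.names = FALSE)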
On MariaDB-10.2.7, with a table schema:
CREATE TABLE items (
id BIGINT(20) NOT NULL,
deleted_at TIMESTAMP NULL
) ENGINE=innodb ;
The query:
LOAD DATA INFILE '/items.csv'
INTO TABLE items
SET deleted_at = NULLIF(deleted_at, 'NULL') ;
items.csv (tab separated):
1 NULL
2 2019-07-24
The result:
ERROR 1292 (22007): Incorrect datetime value: 'NULL' for column 'deleted_at' at row 1
In the CSV, some deleted_at values are the string 'NULL' (not \N). I'd like to convert them to NULL when running LOAD DATA.
I think you need to do it in two steps: read the value into a user variable, then convert it in the SET clause:
LOAD DATA INFILE '/items.csv'
INTO TABLE items
(id, @deleted_at)
SET deleted_at = NULLIF(@deleted_at, 'NULL') ;
SET sql_mode = '' ;
solved the error. MariaDB >= 10.2.4 introduced STRICT_TRANS_TABLES into the default sql_mode, which rejects the string 'NULL' as an invalid timestamp value, so deleted_at never actually holds 'NULL'.
With sql_mode = '' the error is not raised, so NULLIF(deleted_at, 'NULL') works like NULLIF('NULL', 'NULL') for the first row of items.csv, as I had originally expected.
I am trying to write data to a table in a particular schema in HANA (SPS 11) using the RODBC package in R, and am having problems that I hope someone can help with.
I am using sqlSave to create the table and write to it with the command below, but I am getting weird results.
res <- sqlSave(ch, dim_product_master_test, tablename = table.for.save, rownames = FALSE, verbose = TRUE)
Query: CREATE TABLE MYSCHEMA."DIM_PRODUCTSX" ("ProdSrcMonth" varchar(255), "Category" varchar(255), "SubCategory" varchar(255), "Brand" varchar(255), "Material" INTEGER, "Product" varchar(255), "EAN" varchar(255) .... etc)
I am getting the error:
Error in sqlColumns(channel, tablename) :
‘MYSCHEMA."DIM_PRODUCTSX"’: table not found on channel
However, the table is being created; sqlSave then can't seem to add the data or find the table.
I tried different quoting schemes (including quoting the schema name), but got the same result.
Query: CREATE TABLE "MYSCHEMA"."DIM_PRODUCTSY" ("ProdSrcMonth" varchar(255), "Category" varchar(255), "SubCategory" varchar(255), "Brand" varchar(255), "Material" INTEGER, "Product" varchar(255), "EAN" varchar(255) ... etc
Error in sqlColumns(channel, tablename) :
‘"MYSCHEMA"."DIM_PRODUCTSY"’: table not found on channel
Quoting both made no difference; again, the table is created but cannot be populated.
If I just throw the data frame at sqlSave, it happily creates the table and adds the data, but I need more control than that.
Also, does anyone know how to create column-store tables? sqlSave seems to default to row store.
Thanks in advance.
Generally, it's a good idea to create the target table in SAP HANA beforehand. That way, things like the COLUMN/ROW store setting and the specific data type of each column can be set as they should be (e.g. sqlSave doesn't seem to create NVARCHAR columns even when Unicode data needs to be saved).
This is an example that just works out of the box for me (also SPS11):
library("RODBC")
ch<-odbcConnect("SK1", uid="DEVDUDE",pwd="*******")
table.for.save <- 'AIRQUALITY'
aqdata <- airquality
sqlSave(ch,dat = aqdata, tablename = table.for.save, verbose = TRUE, rownames = FALSE)
odbcClose(ch)
Query: CREATE TABLE "AIRQUALITY" ("Ozone" INTEGER, "SolarR" INTEGER, "Wind" DOUBLE, "Temp" INTEGER, "Month" INTEGER, "Day" INTEGER)
Query: INSERT INTO "AIRQUALITY" ( "Ozone", "SolarR", "Wind", "Temp", "Month", "Day" ) VALUES ( ?,?,?,?,?,? )
Binding: 'Ozone' DataType 4, ColSize 10
Binding: 'SolarR' DataType 4, ColSize 10
Binding: 'Wind' DataType 8, ColSize 15
Binding: 'Temp' DataType 4, ColSize 10
Binding: 'Month' DataType 4, ColSize 10
Binding: 'Day' DataType 4, ColSize 10
Parameters:
no: 1: Ozone 41//no: 2: SolarR 190//no: 3: Wind 7.4//no: 4: Temp 67//no: 5: Month 5//no: 6: Day 1//
...
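Regarding the column-store question: one option (a sketch along the same lines; the DDL and the renaming step are illustrative, not something sqlSave does for you) is to create the table with CREATE COLUMN TABLE first and then let sqlSave append to it:

library("RODBC")
ch <- odbcConnect("SK1", uid = "DEVDUDE", pwd = "*******")

# Create the target as a column-store table with exactly the types you want
sqlQuery(ch, 'CREATE COLUMN TABLE "AIRQUALITY" (
               "Ozone" INTEGER, "SolarR" INTEGER, "Wind" DOUBLE,
               "Temp" INTEGER, "Month" INTEGER, "Day" INTEGER)')

aqdata <- airquality
names(aqdata) <- sub("\\.", "", names(aqdata))  # "Solar.R" -> "SolarR"

# append = TRUE makes sqlSave reuse the existing table instead of creating one
sqlSave(ch, dat = aqdata, tablename = "AIRQUALITY",
        append = TRUE, rownames = FALSE, verbose = TRUE)
odbcClose(ch)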
I have two tables joined together through a third, many-to-many relation table. I'm trying to do a select, but SQLite (version 3.11.0) keeps telling me that one of my tables doesn't exist, which is not true! I have no idea what I am doing wrong.
Here are my tables:
DROP TABLE IF EXISTS traits;
CREATE TABLE traits(
trait_id INTEGER UNIQUE NOT NULL CHECK(TYPEOF(trait_id) = 'integer'),
name VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(name) = 'text'),
uri VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(uri) = 'text'),
PRIMARY KEY (trait_id)
);
DROP TABLE IF EXISTS trait_categories;
CREATE TABLE trait_categories(
trait_category_id INTEGER UNIQUE NOT NULL CHECK(TYPEOF(trait_category_id) = 'integer'),
efo_id VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(efo_id) = 'text'),
name VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(name) = 'text'),
uri VARCHAR UNIQUE NOT NULL CHECK(TYPEOF(uri) = 'text'),
PRIMARY KEY (trait_category_id)
);
DROP TABLE IF EXISTS trait_categories_traits;
CREATE TABLE trait_categories_traits(
trait_category_id INTEGER NOT NULL CHECK(TYPEOF(trait_category_id) = 'integer'),
trait_id INTEGER NOT NULL CHECK(TYPEOF(trait_id) = 'integer'),
FOREIGN KEY (trait_category_id) REFERENCES trait_categories(trait_category_id),
FOREIGN KEY (trait_id) REFERENCES traits(trait_id)
);
Here is my SELECT which fails:
SELECT trait_categories.name, traits.name
FROM trait_categories JOIN trait_categories_traits ON trait_categories_traits.trait_category_id = trait_categories.trait_category_id
JOIN traits.trait_id ON trait_categories_traits.trait_id = traits.trait_id;
SQLite says:
sqlite> select trait_id from traits limit 1;
663
sqlite> SELECT trait_categories.name, traits.name
...> FROM trait_categories JOIN trait_categories_traits ON trait_categories_traits.trait_category_id = trait_categories.trait_category_id
...> JOIN traits.trait_id ON trait_categories_traits.trait_id = traits.trait_id;
Error: no such table: traits.trait_id
Please help.
JOIN joins two tables, so it expects two table names, but traits.trait_id is not a table name.
It appears you wanted to join the traits table, so remove the .trait_id. (And when both join columns have the same name, USING is simpler.)
SELECT ...
FROM trait_categories
JOIN trait_categories_traits USING (trait_category_id)
JOIN traits USING (trait_id);
Create SOF.SQL
CREATE TABLE "android_metadata" ("locale" TEXT DEFAULT 'en_US');
INSERT INTO "android_metadata" VALUES ('en_US');
CREATE TABLE main.t_def (
_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
word TEXT(20) not null,
word_def TEXT(20) not null
);
insert into t_def (word, word_def) values ('ball','spherical object');
insert into t_def (word, word_def) values ('cat','feline');
insert into t_def (word, word_def) values ('dog','common housekept');
CREATE TABLE main.t_a (
_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
corr_answer TEXT(20) not null,
user_answer TEXT(20) not null,
is_correct INTEGER not null
);
insert into t_a (user_answer, corr_answer, is_correct) values ('ball','cat',0);
insert into t_a (user_answer, corr_answer, is_correct) values ('dog','dog',1);
.exit
Then run:
sqlite3 foo.db < SOF.SQL
I want a result set that is:
ball|spherical object|cat|feline|0
This is the closest I have gotten:
select t_def.word, t_def.word_def from t_def, t_a where t_a.is_correct=0 and t_a.corr_answer=t_def.word;
To get values from two rows, you need two instances of the table:
SELECT t_a.user_answer,
user_def.word_def AS user_word_def,
t_a.corr_answer,
corr_def.word_def AS corr_word_def,
t_a.is_correct
FROM t_a
JOIN t_def AS user_def ON t_a.user_answer = user_def.word
JOIN t_def AS corr_def ON t_a.corr_answer = corr_def.word
WHERE NOT t_a.is_correct