I have two databases with the same structure. The first is the main one, while the second get updated periodically (in reality I have multiple "secondary" databases that I want to merge one by one into the main one).
The structure of the main and the secondary databases is identical.
I want to periodically dump all new values from the secondary database in the main one. However, the second time I do it, I want to exclude rows that were already copied the first time (and so on).
The tables in all these database have:
an ID column set as PRIMARY KEY going from 1 to N for each database (I suspect this was a mistake, but at the moment I can't change this)
a DATE column, representing a posix timestamp (float)
some other columns
My code looks like this:
ATTACH DATABASE secondary.db AS temp_db
DROP TABLE IF EXISTS my_table_temp
CREATE TABLE my_table_temp AS SELECT * FROM my_table
INSERT INTO main.my_table_temp SELECT * FROM temp_db.my_table
DELETE FROM my_table
INSERT INTO main.my_table SELECT DISTINCT * FROM main.my_table_temp ORDER BY date
DROP TABLE my_table_temp
the problem is that - I suspect due to the repeated ID column - the DISTINCT clause returns me:
UNIQUE constraint failed: my_table.id
However I don't care at all of the ID field that could also be dropped or reset.
NOTES:
the secondary databases are constantly updated by a code that - at the moment - I can't change
I initialize the "main" database copy-pasting one of the secondary to avoid regenerating the whole structure from scratch. Maybe there is a better way of doing this
Apologies if this is a naive question, but I'm very new with SQLite.
Thanks
Following the advice from #forpas, I solved this with the following code:
Assuming the columns to be id,date,col1 and col2
ATTACH DATABASE secondary.db AS temp_db
DROP TABLE IF EXISTS my_table_temp
CREATE TABLE my_table_temp AS SELECT date,col1,col2 FROM my_table
INSERT INTO main.my_table_temp SELECT date,col1,col2 FROM temp_db.my_table
DROP TABLE my_table /* I need to recreate my_table as I've removed a column*/
CREATE TABLE main.my_table AS SELECT DISTINCT date,col1,col2 FROM main.my_table_temp ORDER BY date
DROP TABLE my_table_temp
also, I automatized the extraction of the column names doing
SELECT name FROM PRAGMA_TABLE_INFO('my_table');
This is then passed to the python code running the script and the column id is removed from the list. Note that the second (and following) time I run this code, the column id won't be present in my_table to start with. However this approach allows the code to be the same in the two cases: either if the column id is there or not.
This procedure is then iterated over each table name to fully merge the two databases.
Related
The full-text search extensions of SQLite (FTS3, FTS4, FTS5) create shadow tables. E.g., if I create FTS5-table
CREATE VIRTUAL TABLE test USING fts5(textData);
that will also automatically create several real (non-virtual) tables, with test_content among them, which (apparently) stores the actual data inserted into the original virtual table test.
I know SQLite authors suggest that these shadow tables "should not be accessed directly by the user". But it's not clear from the docs if there are no any guarantee at all about behavior of these tables, or this advise concerns primarily attempts to directly INSERT or UPDATE on them. But what are the risks of reading from these tables?
Specifically - I need another (regular) table to have a FOREIGN KEY which references rowid of the virtual FTS5 table:
CREATE VIRTUAL TABLE test USING fts5(textData);
CREATE TABLE myTable (col1 INTEGER REFERENCES test(rowid));
I couldn't find hints on that in the docs, but my own experiments showed that foreign key just doesn't work here - I still can delete records from the test table even though they are referenced from myTable. However, if instead I do
CREATE VIRTUAL TABLE test USING fts5(textData);
CREATE TABLE myTable (col1 INTEGER REFERENCES test_content(id));
then everything seems to work as needed - I can't delete records from the virtual test table if they are referenced from myTable.
This works because, as follows from direct inspection, the rowid value of the test table is always equal to the id column value of the test_content table. Even when I specify rowid manually like in INSERT INTO test (rowid, value) VALUES (424242, 'foobar'); - then a new row appears in the test_content with the corresponding content and the corresponding id equal to 424242 (even if no preceding records were ever in the table).
So basically, my question is - is this correspondence between the rowid of an FTS5-table and the id of the corresponding <name>_content shadow table guaranteed to hold? Or might this break in some cases or in future versions?
I'm coming from SQL Server enviroment where you can declare a temp table with #table, but as I've read you can't do this in oracle.
I want get a value for 500.000 hardcoded id's from a table, but as the IN clause has a limit of 1000 I need to find another way. Is the best way to create a temporary table and insert the hardcoded values and then join the other table which contains the values I need ?
My client (toad) has autocommit set to off and I dont want to commit anything, I want it to be session-based so when I close the database client I want the temporary table do disappear. Is the code below the right way to do in oracle?
CREATE GLOBAL TEMPORARY TABLE Test(HardcodedId number(10))
ON COMMIT DELETE ROWS;
I've also tried to use inner join and in the join select the hardcoded values from dual, but this creates a column for each value and i'm not able to use a reference to join with. Is it possible to insert all values into a single column in dual?
You can use some thing like this (500 union all)
select * from (
select '1' from dual
union all
select '2' from dual
...) q
Then you can join this with other tables.
For your situation, I would use a GTT (global temporary table) - which you have already researched by the looks.
The advantage of a GTT is that it's a permanent object (so no need to constantly create and drop it) and the data "stored" in it is on a session basis.
suppose we have a file with just one table named TableA and this table has just one column named Text;
let say we populate our TableA with 3,000,000 of strings like these(each line a record):
Many of our patients are incontinent.
Many of our patients are severely disturbed.
Many of our patients need help with dressing.
if I save the file at this level it'll be: ~326 MB
now let say we want to increase the speed of our queries and therefore we set our Text column as the PrimaryKey(or create index on it);
if I save the file at this level it'll be: ~700 MB
our query:
SELECT Text FROM "TableA" where Text like '% home %'
for the table without index: ~5.545s
for the indexed table: ~2.231s
As far as I know when we create index on a column or set a column to be our PrimaryKey then sqlite engine doesn't need to refer to table itself(if no other column was requested in query) and it uses the index for query and hence the speed of query execution increases;
My question is in the scenario above which we have just one column and set that column to be the PrimaryKey too, then why sqlite holds some kind of unnecessary data?(at least it seems unnecessary!)(in this case ~326 MB) why not just keeping the index\PrimaryKey data?
In SQLite, table rows are stored in the order of the internal rowid column.
Therefore, indexes must be stored separately.
In SQLite 3.8.2 or later, you can create a WITHOUT ROWID table which is stored in order of its primary key values.
I need to extract all rows from every table that has the column imgAssetURL to add to a pre loading system.
I think in essence something like:
SELECT imgAssetURL FROM *
What are my options?
The definitions for all tables are located in the sqlite_master table. You would have to read those definitions, figure out which tables have the column in it, and run a query on each of those.
See http://www.sqlite.org/fileformat2.html#sqlite_master
Ok I have a sqlite db, that has roughly 100 rows. It is kind of a strange thing that I'm trying to do, but I need to insert a new row between each of the existing rows.
I have been trying to use the Insert statement as follows, but haven't had any luck:
insert into t1(column1) values("hello") where id%2 == 0
So I'm basically trying to use the %-operator to tell me if the id is even or odd. For every even id number, I'd like to insert a new row.
What am I missing? What can I do differently? How can I insert a new row into every other row and have the index updated as well?
Thanks
Your question assumes that the rows have some kind of built-in order to them, and that you can insert rows between other rows. That's not true.
It is true that rows have an order on disk, and that the id column is usually assigned in order, but that's an implementation detail. When you perform a query, the database is free to return the rows in any order it chooses, unless you specify what you want with an ORDER BY clause.
Now, I'm assuming what you really want is to insert rows between the existing rows in id order. One way to get what you want would look like this:
UPDATE t1 SET id = id * 2
INSERT INTO t1 (id, column) SELECT id+1, "hello" FROM t1
The UPDATE would double the ids of all the existing rows (so 1,2,3 becomes 2,4,6); then the INSERT would perform a query on t1 and use the result to insert a new set of rows with id values one more than the existing rows (so 2,4,6 becomes 3,5,7).
I haven't tested the above statements, so I don't know if they would work or if they require some extra trickery (like a temporary table) since we are querying and updating the same table in one statement. Also I may have made a syntax error.
Don't consider the rows as pre-ordered in the database. A database will store them as they come in, or according to an index. It's your task to order them on retrieval (i.e. when you query for data) according to your needs.