SQLite sequence not associated with an auto-increment column

I've got a situation where I need to generate a unique id for use across multiple tables, something like tables A, B, C each having a uniqueId field. For business reasons, the ids have to be unique across all tables, but because there are multiple tables, I can't use an auto-increment column.
Is there a way to make a sequence that can be shared like this?
(I know I could just make the uniqueId column TEXT and store a GUID in there, and that would be unique, but I may have a LOT of these, and I'd rather use a 4 byte integer than a 32 byte GUID for this. Is there a more compact way to generate a non-conflicting identifier in sqlite?)

Traditionally you'd use a sequence; just an auto-incrementing counter. Unfortunately, SQLite doesn't support sequences.
Use a Universally Unique Identifier, a UUID. UUIDv4 is essentially a 128-bit random number. Generate it in your program and insert it; preferably insert it as a 128-bit value (a 16-byte BLOB), not as a string.
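A minimal sketch of the UUID approach, assuming Python's built-in sqlite3 and uuid modules. The table layout and the payload column are made up for illustration; only the uniqueId name comes from the question.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical table: only the uniqueId column name comes from the question.
conn.execute("CREATE TABLE a (uniqueId BLOB PRIMARY KEY, payload TEXT)")

new_id = uuid.uuid4()
# UUID.bytes is the raw 16-byte (128-bit) form; str(new_id) would be 36 characters.
conn.execute(
    "INSERT INTO a (uniqueId, payload) VALUES (?, ?)",
    (new_id.bytes, "example"),
)

# Read the BLOB back and rebuild the UUID object from it.
stored = conn.execute("SELECT uniqueId FROM a").fetchone()[0]
restored = uuid.UUID(bytes=stored)
```

This is still 16 bytes per id rather than the 4 the question asks for, so it only helps if the larger size is acceptable.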

Create another table with just an auto-increment column (SQLite does allow a single-column table), and add triggers for inserts on the other tables that:
First insert a row into this "fake-sequence" table
Then fetch the last inserted row's id from that table
And finally write that "fake-sequence"-generated value into the global-id column of the other table.
This should work; SQLite does support triggers.
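The steps above can be sketched roughly like this, assuming Python's sqlite3 module. The names global_seq, a_assign_id, b_assign_id and payload are hypothetical. SQLite triggers cannot change the row being inserted, so this sketch uses AFTER INSERT triggers that update the new row, and it uses VALUES (NULL) because INSERT ... DEFAULT VALUES is not allowed inside a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical "fake-sequence" table: one auto-increment column, nothing else.
CREATE TABLE global_seq (id INTEGER PRIMARY KEY AUTOINCREMENT);

CREATE TABLE a (uniqueId INTEGER UNIQUE, payload TEXT);
CREATE TABLE b (uniqueId INTEGER UNIQUE, payload TEXT);

-- After each insert: take a new value from global_seq and store it on the new row.
CREATE TRIGGER a_assign_id AFTER INSERT ON a
BEGIN
    INSERT INTO global_seq (id) VALUES (NULL);
    UPDATE a SET uniqueId = last_insert_rowid() WHERE rowid = NEW.rowid;
END;

CREATE TRIGGER b_assign_id AFTER INSERT ON b
BEGIN
    INSERT INTO global_seq (id) VALUES (NULL);
    UPDATE b SET uniqueId = last_insert_rowid() WHERE rowid = NEW.rowid;
END;
""")

conn.execute("INSERT INTO a (payload) VALUES ('first')")
conn.execute("INSERT INTO b (payload) VALUES ('second')")
conn.execute("INSERT INTO a (payload) VALUES ('third')")

ids_a = [r[0] for r in conn.execute("SELECT uniqueId FROM a ORDER BY uniqueId")]
ids_b = [r[0] for r in conn.execute("SELECT uniqueId FROM b ORDER BY uniqueId")]
```

The global_seq table grows by one row per insert; if that matters, the trigger could delete older rows, since AUTOINCREMENT never reuses a value.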

Related

SQLite: re-arrange physical position of rows inside file

My problem is that my queries are too slow.
I have a fairly large sqlite database. The table is:
CREATE TABLE results (
timestamp TEXT,
name TEXT,
result float
)
(I know that timestamps as TEXT is not optimal, but please ignore that for the purposes of this question. I'll have to fix that when I have the time.)
"name" is a category. This table holds the results of a calculation that has to be done at each timestamp for all "name"s. So the inserts are done at equal timestamps, but the queries will be done at equal names (i.e. given a name, I want to get its time series), like:
SELECT timestamp,result FROM results WHERE name='some_name';
The way I'm doing things now is to have no indexes, calculate all results, then create an index on name: CREATE INDEX index_name ON results (name). The reasoning is that I don't need the index when I'm inserting, but having the index will make queries on name really fast.
But it's not. The database is fairly large. It has about half a million timestamps, and for each timestamp I have about 1000 names.
I suspect, although I'm not sure, that the reason why it's slow is that even though I've indexed the names, the rows are still scattered all around the physical disk. Something like:
timestamp1,name1,result
timestamp1,name2,result
timestamp1,name3,result
...
timestamp1,name999,result
timestamp1,name1000,result
timestamp2,name1,result
timestamp2,name2,result
etc...
I'm sure this is slower to query with NAME='some_name' than if the rows were physically ordered as:
timestamp1,name1,result
timestamp2,name1,result
timestamp3,name1,result
...
timestamp499997,name1000,result
timestamp499998,name1000,result
timestamp499999,name1000,result
timestamp500000,name1000,result
etc...
So, how do I tell SQLite that the order in which I'd like the rows on disk isn't the one they were written in?
UPDATE: I'm further convinced that the slowness in doing a select with such an index comes exclusively from non-contiguous disk access. Doing SELECT * FROM results WHERE name=<something_that_doesnt_exist> immediately returns zero results. This suggests that it's not finding the names that's slow, it's actually reading them from the disk.
Normal SQLite tables have, as a primary key, a 64-bit integer (known as rowid, plus a few other aliases). That determines the order in which rows are stored in a B*-tree (which puts all actual data in leaf node pages). You can change this with a WITHOUT ROWID table, but that requires an explicit primary key, which is used to place rows in a B-tree. So if every row's (name, timestamp) pair is unique, that's an option that will leave all rows with the same name on a smaller set of pages instead of scattered all over.
You'd want the composite PK to be in that order if you're searching for a particular name most of the time, so something like:
CREATE TABLE results (
timestamp TEXT
, name TEXT
, result REAL
, PRIMARY KEY (name, timestamp)
) WITHOUT ROWID
(And of course not bothering with a second index on name.) The tradeoff is that inserts are likely to be slower as the chances of needing to split a page in the B-tree go up.
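A small sketch of that table in use, assuming Python's sqlite3 module and a few made-up timestamps, names and values. It inserts in timestamp order as in the question, then checks with EXPLAIN QUERY PLAN that a lookup by name goes through the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE results (
    timestamp TEXT,
    name TEXT,
    result REAL,
    PRIMARY KEY (name, timestamp)
) WITHOUT ROWID
""")

# Made-up sample data, inserted timestamp by timestamp as in the question.
rows = [
    (f"2020-01-0{day}", f"name{n}", day * 10.0 + n)
    for day in range(1, 4)
    for n in range(1, 4)
]
conn.executemany(
    "INSERT INTO results (timestamp, name, result) VALUES (?, ?, ?)", rows
)

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT timestamp, result FROM results WHERE name = ?",
    ("name2",),
).fetchall()
plan_detail = " ".join(row[3] for row in plan)

# The time series for one name, read from rows that are stored next to each other.
series = conn.execute(
    "SELECT timestamp, result FROM results WHERE name = ? ORDER BY timestamp",
    ("name2",),
).fetchall()
```

The exact wording of plan_detail differs between SQLite versions, but it should mention the primary key rather than a full table scan.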
Some pragmas worth looking into to tune things:
cache_size
mmap_size
optimize (After creating your index; also consider building sqlite with SQLITE_ENABLE_STAT4.)
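A rough sketch of setting those pragmas from Python's sqlite3 module. The sizes are placeholder guesses, not recommendations, and the in-memory database stands in for the real file.

```python
import sqlite3

# Placeholder: open the real database file here instead of an in-memory one.
conn = sqlite3.connect(":memory:")

# A negative cache_size is in KiB, so this asks for roughly 64 MiB of page cache.
conn.execute("PRAGMA cache_size = -65536")
# Request up to 256 MiB of memory-mapped I/O; SQLite may cap or ignore this.
conn.execute("PRAGMA mmap_size = 268435456")
# Let SQLite refresh query planner statistics where it thinks that is worthwhile.
conn.execute("PRAGMA optimize")

cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
```

cache_size and mmap_size apply per connection, so they have to be set again every time the database is opened.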
Since you don't have an INTEGER PRIMARY KEY, consider VACUUM after deleting a lot of rows if you ever do that.

How to make values unique in cassandra

I want to enforce a unique constraint in Cassandra: every value in each column of my column family should be unique.
For example:
name-rahul
phone-123
address-abc
Once this row exists, no other row containing rahul, 123 or abc should be inserted. Searching the DataStax documentation, I found that I can achieve this for the partition key with IF NOT EXISTS, but I can't find a way to make all three values unique.
That means that if I have
name- jacob
phone-123
address-qwe
this row should not be inserted either, because its phone value is the same as in the row with name-rahul.
The short answer is that constraints of any type are not supported in Cassandra. They are simply too expensive, as they must involve multiple nodes, thus defeating the purpose of having eventual consistency in the first place. If you needed to make a single column unique, then there could be a solution, but not for several unique columns. For the same reason, there is no isolation and no consistency (the C and I from ACID). If you really need to use Cassandra with this type of enforcement, then you will need to create some kind of synchronization layer in your application which intercepts all requests to the database and makes sure that the values are unique and all constraints are enforced. But this won't have anything to do with Cassandra.
I know this is an old question and the existing answer is correct (you can't do constraints in C*), but you can solve the problem using batched creates. Create one or more additional tables, each with the constrained column as the primary key, and then batch the creates, which is an atomic operation. If any of those column values already exist the entire batch will fail. For example, if the table is named Foo, also create Foo_by_Name (primary key Name), Foo_by_Phone (primary key Phone), and Foo_by_Address (primary key Address) tables. Then when you want to add a row, create a batch with all 4 tables. You can either duplicate all of the columns in each table (handy if you want to fetch by Name, Phone, or Address), or you can have a single column of just the Name, Phone, or Address. One caveat: a plain INSERT in Cassandra is an upsert and silently overwrites an existing row, so the inserts only reject duplicates when they are conditional (IF NOT EXISTS), and a batch that contains conditional statements is restricted to a single table and partition. Check how this behaves on your Cassandra version before relying on it.

How to use one sequence for all table in sqlite

When I'm creating tables in an SQLite database, separate sequences are automatically created for each table.
However I want to use one sequence for all tables in my SQLite database and also need to set min and max values (e.g. min=10000 and max=999999) of that sequence (min and max means start value of sequence and maximum value that sequence can increment).
I know this can be done in an Oracle database, but don't know how to do it in SQLite.
Is there any way to do this?
Unfortunately, you cannot do this: SQLite keeps a separate counter for each AUTOINCREMENT table in the special sqlite_sequence table.
And even if you somehow forced it to use a single sequence as the source for all your tables, it would not work the way you expect. For example, in PostgreSQL or Oracle, if you set a sequence to, say, 1 but the table already has rows with values 1,2,..., any attempt to insert new rows using that sequence as a source would fail with a unique constraint violation.
In SQLite, however, if you manually reset the sequence using something like:
UPDATE sqlite_sequence SET seq = 1 WHERE name = 'mytable'
and you already have rows 1,2,3,..., new inserts will NOT fail: SQLite takes the larger of the stored seq value and the largest existing rowid in the table and adds one, so the ids keep going up.
Note that you can use this hack to set the starting value of the sequence; however, you cannot make it stop incrementing at some value (unless you enforce that by other means).
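A short sketch of this behaviour, assuming Python's sqlite3 module. The payload column is made up, and the CHECK constraint is an assumption added here as one possible "other means" of enforcing the maximum; it is not something SQLite's sequence handling provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK is a hypothetical cap; AUTOINCREMENT itself has no maximum setting.
conn.execute("""
CREATE TABLE mytable (
    id INTEGER PRIMARY KEY AUTOINCREMENT CHECK (id <= 999999),
    payload TEXT
)
""")

# sqlite_sequence exists once an AUTOINCREMENT table has been created.
# Seeding it with 9999 makes the first generated id 10000.
conn.execute("INSERT INTO sqlite_sequence (name, seq) VALUES ('mytable', 9999)")
first_id = conn.execute("INSERT INTO mytable (payload) VALUES ('a')").lastrowid

# Resetting the counter below the existing ids does not cause a collision.
conn.execute("UPDATE sqlite_sequence SET seq = 1 WHERE name = 'mytable'")
second_id = conn.execute("INSERT INTO mytable (payload) VALUES ('b')").lastrowid

# Push the counter to the cap: the next generated id would break the CHECK.
conn.execute("UPDATE sqlite_sequence SET seq = 999999 WHERE name = 'mytable'")
try:
    conn.execute("INSERT INTO mytable (payload) VALUES ('c')")
    hit_cap = False
except sqlite3.IntegrityError:
    hit_cap = True
```

This covers the start and maximum values for one table only; it does not make several tables share one sequence.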
First of all, this is a bad idea.
The performance of database queries depends on predictability, and by fiddling with the sequence of indexes you are introducing offsets which will confuse the database engine.
However, to achieve this you could try to determine the lowest sequence number which is higher than or equal to any existing sequence number:
SELECT MAX(seq) FROM sqlite_sequence
This needs to be done after each INSERT query, followed by an update of all sequences for all tables:
UPDATE sqlite_sequence SET seq=determined_max
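A rough sketch of that idea, assuming Python's sqlite3 module and two made-up tables t1 and t2; insert_shared is a hypothetical helper. One detail the two statements above do not cover: sqlite_sequence only gets a row for a table after the first insert into it, so this sketch seeds a row for every table up front. The MAX lookup and the UPDATE are folded into one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT);
CREATE TABLE t2 (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT);
-- Seed a counter row per table so the UPDATE below reaches all of them.
INSERT INTO sqlite_sequence (name, seq) VALUES ('t1', 0), ('t2', 0);
""")


def insert_shared(table, payload):
    """Insert a row, then raise every table's counter to the highest one."""
    # The table name is interpolated into the SQL, so only pass trusted names.
    with conn:  # one transaction: the insert and the sync commit together
        new_id = conn.execute(
            f"INSERT INTO {table} (payload) VALUES (?)", (payload,)
        ).lastrowid
        conn.execute(
            "UPDATE sqlite_sequence SET seq = (SELECT MAX(seq) FROM sqlite_sequence)"
        )
    return new_id


ids = [
    insert_shared("t1", "a"),
    insert_shared("t2", "b"),
    insert_shared("t1", "c"),
]
```

Every insert has to go through the helper; a direct INSERT that skips the sync step can hand out an id another table already uses.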

SQLite: How to renumber an auto-increment field?

I have two different CSV files which I have merged and imported into a single table in a SQLite3 database. Each CSV file contained a column called ID. Since some of the IDs are duplicates (across the CSV files) and this is a primary key field, I need a way to completely renumber the ID field for each row in the table.
The ID field is also an auto-increment field.
So, what I would like to do is to run a SQL command or some other method where the ID for each row of the table would be reset to ensure uniqueness. For example, the ID field for the first row will be set to 1 the next to 2 and so on.
Note, it is not so important that it begin with 1. Ensuring primary key uniqueness is the goal here. It doesn't matter what number it starts at. There are also no foreign key relations so that is not an issue.
Any suggestions much appreciated.
Okay, in my case, I figured out that it was easiest to not import the ID column. Rather, I imported everything else and then added an ID field of type auto-increment. Once I did that, everything was re-numbered as I wanted.
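One way this could look in plain SQL, sketched with Python's sqlite3 module. The table names imported and items and the name column are made up. SQLite cannot add a primary key column to an existing table with ALTER TABLE, so this sketch copies everything except the old ID into a new table that has the auto-increment column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the merged CSV import, including the clashing ID values.
CREATE TABLE imported (ID INTEGER, name TEXT);
INSERT INTO imported (ID, name) VALUES (1, 'a'), (2, 'b'), (1, 'c'), (2, 'd');

-- New table with a fresh auto-increment ID; the old ID column is not copied.
CREATE TABLE items (ID INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
INSERT INTO items (name) SELECT name FROM imported ORDER BY rowid;
DROP TABLE imported;
""")

new_ids = [r[0] for r in conn.execute("SELECT ID FROM items ORDER BY ID")]
names = [r[0] for r in conn.execute("SELECT name FROM items ORDER BY ID")]
```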

SQLite - storing multiple values

How can I store and retrieve multiple values for the same row of the same column in an SQLite database?
For example, I have a product column and another column, stores, where I list all the stores where the product is available:
Product: iLamp;
Stores: River's; McWay; Lonnie's; ...
How can I implement this?
Thank you in advance.
If you're smart, you won't do this. Because when it comes time to figure out which stores stock the item, your queries will be hideously deformed. So will those of your stock control application when they try to insert and delete stores. What you'll end up with is what I like to call SQL gymnastics, spending more and more time trying to do SQL in the most bizarre way, simply due to a bad design choice.
Seriously, store these in different rows in the database, as Codd intended.
It's far easier (and faster in terms of the DBMS grunt) to combine multiple rows into a single semicolon-separated string than to break that string into elements.
A schema such as this would suffice:
Products:
ProdCode integer primary key
ProdDesc varchar(50)
Stores:
StoreCode integer primary key
StoreDesc varchar(50)
StockLevels:
StoreCode integer \
ProdCode integer / primary key
Count integer
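A minimal sketch of that schema in SQLite, assuming Python's sqlite3 module. The product and store names are taken from the question; the second product, the codes and the stock counts are made up. The query at the end shows the "combine rows into one string" direction using SQLite's group_concat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProdCode INTEGER PRIMARY KEY, ProdDesc VARCHAR(50));
CREATE TABLE Stores (StoreCode INTEGER PRIMARY KEY, StoreDesc VARCHAR(50));
CREATE TABLE StockLevels (
    StoreCode INTEGER REFERENCES Stores (StoreCode),
    ProdCode INTEGER REFERENCES Products (ProdCode),
    "Count" INTEGER,
    PRIMARY KEY (StoreCode, ProdCode)
);
""")

conn.executemany(
    "INSERT INTO Products (ProdCode, ProdDesc) VALUES (?, ?)",
    [(1, "iLamp"), (2, "OtherProduct")],  # second product is made up
)
conn.executemany(
    "INSERT INTO Stores (StoreCode, StoreDesc) VALUES (?, ?)",
    [(1, "River's"), (2, "McWay"), (3, "Lonnie's")],
)
# One row per (store, product) pair; the counts are arbitrary sample values.
conn.executemany(
    'INSERT INTO StockLevels (StoreCode, ProdCode, "Count") VALUES (?, ?, ?)',
    [(1, 1, 5), (2, 1, 2), (3, 1, 7), (2, 2, 1)],
)

# All stores stocking one product, joined into a single string on the way out.
stores_for_ilamp = conn.execute(
    """
    SELECT group_concat(s.StoreDesc, '; ')
    FROM StockLevels AS sl
    JOIN Stores AS s ON s.StoreCode = sl.StoreCode
    JOIN Products AS p ON p.ProdCode = sl.ProdCode
    WHERE p.ProdDesc = ?
    """,
    ("iLamp",),
).fetchone()[0]
```

The order of the names inside the concatenated string is not guaranteed, so sort in the application if it matters.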
Like others have mentioned, you could store it as a comma-separated string and put that in your database, but like ocdecio mentioned, the way you have your tables right now is a bad design. You should consider adding another table, something like a PRODUCT_TO_STORE table with two columns: one holds the key of a product and the other holds the key of a store. That gives you a relationship between products and stores and helps normalize your data, which in most cases is a good thing. Then when you need to find all the stores a product is in, you can just query that PRODUCT_TO_STORE table.
I would use something simple like JSON. However this is bad db design since it is not normalized.
change the schema.
do not store multiple values in the same row's column.
add a new table where the multiple values can be stored in their own rows
bad table design:
parents:
ParentID, values, other columns
good table design:
parent
parentID, other columns
child
parentID, childID, value
Either store the stores as a comma-separated list for each product, or add multiple rows, one for each "Store"-"Product" pair.
