SqliteDataReader and duplicate column names when using LEFT JOIN - sqlite

Is there a documentation/specification about Sqlite3 that would describe would is supposed to happen in the following case?
Take this query:
var cmd = new SqliteCommand("SELECT Items.*, Files.* FROM Items LEFT JOIN Files ON Files.strColName = Items.strColName");
Both Items and Files have a column name "strColName". If an entry exists in Files, it will be joined to the result, if not, it will be NULL.
Let's assume I always need the value of strColName, no matter if it is coming from Items or from Files. If I execute a reader:
var reader = cmd.ExecuteReader();
If there is a match in Files, reader["strColName"] will obviously contain the correct result because the value is set and it is the same in both tables. But if there wasn't a match in Files, will the NULL value of Files overwrite the non-NULL value of Items?
I'm really looking for some specification that defines how a Sqlite3 implementation has to deal with this case, so that I can trust either result.

SQLite has no problem returning multiple columns labelled with the same name.
However, the columns will always be returned in exactly the same order they are written in the SELECT statement.
So, when you are searching for "strColName", you will find the first one, from Items.
It is recommended to use explicit column names instead of * so that the order is clear, and you can access values by their column index, if needed (and you detect incompatible changes in the table structure).

Related

Use RocksDB to support key-key-value (RowKey->Containers) by splitting the container

Support I have key/value where value is a logical list of strings where I can append strings. To avoid the situation where inserting a single string item to the queue causing re-write the entire list, I'd using multiple key-value pairs to represent it.
Key -> metadata of the value such as length and subkey format
Key-l1 -> value of item 1 in list
Key-l2 -> value of item 2 in list
Key-ln -> the lastest value in the list
I'd override the key comparer in RocksDB such that sorting of Key-ln formatted key is sort Key part first and ln second (i.e. group by and sort by Key and within the same Key value sort by ln). This way, all the list items along with its root key and metadata are grouped together in sst during initial bulk insert and during later sst compaction.
Appending a new list item becomes (1) first read Key-metadata to get the current list size of n; 2) insert Key-l(n+1) with new value. Deleting list item works as it is for RocksDB by deleting Key-ln and update the metadata.
To ensure the consistency, (1) and (2) will be done inside a RocksDB transaction.
This design seems to be ok?
Now, if I want to add anther feature of TTL for entire key-value(list), I'd use TTL support already in RocksDB. My understanding is that TTL to remove expired item happens during compaction. However, such compaction is not done under a transaction. RocksDB doesn't know that Key-metadata and Key-ln entries are related. It is entirely possible that there is a time window where Key->metadata(root node) is deleted while child nodes of (Key-ln) is not deleted yet (or reverse order). If during this time window, someone reads or update the list, it will get an inconsistent for the Key-list. Any remedy for it?
Thanks
You should use Merge Operator, it's designed for such value append use case. Your design is read-before-write, which has performance penalty, in general it should be avoided if possible: What's read-before-write in NoSQL?.
Options options;
options.merge_operator.reset(new StringAppendOperator(','));
DB::Open(options, kDBPath, &db)
...
db->Merge(WriteOptions(), "key", "value1");
db->Merge(WriteOptions(), "key", "value2");
db_->Get(ReadOptions(), "key", &result); // return "value1,value2"
The above example uses a predefined StringAppendOperator, which simply append new values at the end. You can defined your own MergeOperator to customize the merge operation.
In the backend, the merge operation is done on the read path (and compaction to reduce the version number), details: Merge Operator Implementation.

Do not fail on missing column in a SQLLite query

I have a simple query like this:
SELECT * FROM CUSTOMERS WHERE CUSTID LIKE '~' AND BANKNO LIKE '~'
The problem is, the customers-table might or might not contain the BANKNO column depending on circumstances I've no control over. If however BANKNO is not a column in CUSTOMERS, this query fails.
So my question is: it is possible to test if the BANKNO column exists and if so, to include it in the query and if not to exclude this column?
The query really has to be flexible.
A non-existent column in a SELECT to sqlite3 will always fail.
One option might be to put the "full" sql in a try block, and if it errors, execute the other sql.
Or, you could query PRAGMA table_info('CUSTOMERS') and interrogate the result to see if a column in question is in the database. Find the sqlite doc here https://www.sqlite.org/pragma.html#pragma_table_info.
I'm sure there are other options, but the bottom line is you need to know before the sql is executed that it contains only valid column names.

Multiple inserts using 'User Defined Table Type' (asp.net sql-server)

In my ASP.NET web app I have a DataTable filled with data to insert into tblChildren.
The DataTable is passed to a stored procedure.
In the SP I need to read each row (e.i Loop through the DataTable), change a couple of columns in it (with accordance to the relevant data in tblParents) and only then insert the row into tblChildren.
SqlBulkCopy wouldn't do and I don't think TVP will do either (not sure... not too familiar with it yet).
Of course I can iterate through the DataTable rows in the app and send each one separately to the SP, but that would mean hundreds of round trips to the SqlServer.
I came across two possibilities that might achieve that : (1) Temp table
(2) Cursor.
The first is quite messy and the second, as I understand it, is NOT recommended)
Any guidance would be much appreciated.
EDIT :
I tried the approach of user-defined Table Type.
That works because I populate the Table Type (TT_Children) with values in the TT_Child_Family_Id column.
In real life, though, I will not know these values and I would need to loop thru #my_TT_Children and for each row get the value from tblFamilies, something like this :
SELECT Family_Id FROM tblFamilies WHERE Family_Name = TT_Child_Last_Name
(assuming there is always an equivalent for TT_Child_Last_Name in tblFamilies.Family_Name)
So my question is - how to loop through the table-type and for each row look up a value in a different table?
EDIT 2 (the solution) :
As in Amir's perfect answer, the stored procedure should look like this :
ALTER PROCEDURE [dbo].[usp_Z_Insert_Children]
#my_TT_Children TT_Children READONLY
AS
BEGIN
INSERT INTO tblChildren(Child_FirstName,
Child_LastName,
Child_Family_ID)
SELECT Cld.tt_child_FirstName,
Cld.tt_child_LastNAme,
Fml.Family_Id FROM #my_TT_Children Cld
INNER JOIN tblFamilies fml
ON Cld.TT_Child_LastName = Fml.Family_Name
END
Notes by Amir : column Family_Name in tblFamily must be unique and preferably indexed.
(Also I noticed that in case TT_Child_LastName does not have a match in tblFamilies, the row will not be inserted and I'll never know about it. That means that I have to check somehow if all rows were successfully processed).
You can join tblFamilies into the insert in the procedure and take the value from there. Much more efficient than looping through.
Or create a cursor and do one child at a time.
1) Make sure there is only one occurance of FamilyName in tblFamilies.
2) Make sure that if tblFamilies is a large table, then the FamilyName column is indexed.
INSERT INTO tblChildren(Child_FirstName, Child_LastName, Child_Family_ID)
SELECT Cld.tt_child_FirstName,
Cld.tt_child_LastNAme,
Fml.FamilyID
FROM #my_TT_Children Cld
INNER JOIN tblFamilies fml on Cld.TT_Child_LastName = Fml.FamilyName
But be aware that if tblFamilies has more than one entry per Family_Name, then this will duplicate the data. In this case you will need to add more restrictions in the where.

SQLite3 - Updating a single column in a multi-column database

I am creating a table that has a total of five columns. DUring the main "create" process, I only have enough data to populate four of the columns. Later in the execution of the program, I have the data for the fifth column. I start performing an "INSERT OR REPLACE". But! I only use the key column and the fifth column in the statement.
When I browse the database, columns two through four are NULL. So, the question is: Is there a way to only update a specific column (including the key) while keeping the existing data in tact?
INSERT OR REPLACE is the wrong statement, as you've discovered. Unless you can provide correct values for all the columns, an UPDATE statement is a better choice. Just update the column.
update your_table_name
set your_column_name = 'New Value'
where your_key_column = 'Something';
In many applications, more caution is called for.
update your_table_name
set your_column_name = 'New Value'
where your_key_column = 'Something'
and your_column_name is null;

Column ' ' in where clause is ambiguous Error in mysql

SELECT tbl_user.userid,
tbl_user.firstname,
tbl_user.lastname,
tbl_user.email,
tbl_user.created,
tbl_user.createdby,
tbl_organisation.organisationname
FROM tbl_user
INNER JOIN tbl_organisation
ON tbl_user.organisationid = tbl_organisation.organisationid
WHERE organisationid = #OrganisationID;
I am using this statement to do a databind. I am getting a error here.
Column 'OrganisationID' in where clause is ambiguous
What should I do is it wrong to name the OrganisationID in tbl_user same as tbl_organisation.
OrganisationID is a foreign key from tbl_Organisation
Since you have two columns with the same name on two different tables (and that's not a problem, it's even recommended on many cases), you must inform MySQL which one you want to filter by.
Add the table name (or alias, if you were using table aliases) before the column name. In your case, either
WHERE tbl_user.OrganisationID
or
WHERE tbl_Organisation.OrganisationID
should work.
You just need to indicate which table you are targeting with that statement, like "tbl_user.OrganisationID". Otherwise the engine doesn't know which OrganisationID you meant.
It is not wrong the have the same column names in two tables. In many (even most) cases, it is actually perferred.

Resources