Is the following a GUID - guid

Is the following a valid GUID?
202008210743518

A GUID is just a 128-bit number, so sure, you could interpret it as a GUID, just with a lot of zero padding. They are more commonly expressed as a 32-character hex string, though.
00000000-0000-0000-0000-B7B9B3A4A0DE
(which is your seemingly decimal number expressed as a 32-digit hex string)
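The conversion above can be reproduced with a minimal Python sketch using the standard `uuid` module. The variable names are only illustrative, and treating the input as a decimal integer is the same assumption the answer makes.

```python
import uuid

# The number from the question, read as a decimal integer.
number = 202008210743518

# A UUID/GUID is a 128-bit integer; uuid.UUID zero-pads it to 32 hex digits.
guid = uuid.UUID(int=number)

print(str(guid).upper())  # 00000000-0000-0000-0000-B7B9B3A4A0DE
```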

No, it's not. A GUID is 32 hex digits, typically written in groups of 8-4-4-4-12 digits.
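One way to test the textual form is to let a parser decide. The sketch below uses Python's standard `uuid.UUID` constructor, which is fairly lenient (it ignores hyphens and surrounding braces) but still requires exactly 32 hex digits. The helper name `is_guid_string` is made up for this example.

```python
import uuid

def is_guid_string(text):
    """Return True if text parses as 32 hex digits (hyphens and braces allowed)."""
    try:
        uuid.UUID(text)
        return True
    except ValueError:
        return False

print(is_guid_string("202008210743518"))                          # False: only 15 digits
print(is_guid_string("21EC2020-3AEA-1069-A2DD-08002B30309D"))     # True
print(is_guid_string("{21EC2020-3AEA-1069-A2DD-08002B30309D}"))   # True
```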

A GUID is a Globally Unique Identifier. Typically these are identifiers that combine a number of different factors, such as the current time plus some random component, so that they are very likely to be unique, but this is not guaranteed!
You can create GUIDs using your own mechanism if you like; however, most people use the Universally Unique Identifier (UUID) standard, which when written in hex looks like this:
21EC2020-3AEA-1069-A2DD-08002B30309D
If the number you give is supposed to be a GUID, it might work OK, but I would be a bit suspicious as to just how unique it is!

A GUID looks like this: {21EC2020-3AEA-1069-A2DD-08002B30309D}. See http://en.wikipedia.org/wiki/Globally_unique_identifier.

Related

How to set DynamoDB range key, String or Map

I have a DynamoDB table with a primary hash key, and a range key. Range key will have two attributes. Say those attribute names are: name1, name2, with values value1, value2
Plan A: combine two attributes as string, use comma as delimiter
Primary hash key: id
Range key: value1,value2
Cons
1. the comma may not work if some weird values contain this delimiter
Plan B: serialize the map to a String for the range key
Primary hash key: id
Range key: "{\"name1\": \"value1\", \"name2\": \"value2\"}"
Cons
1. different SDKs may produce different JSON strings for the same value? (Not sure.) I need to support reads/writes from multiple SDKs, like Java and Ruby
So, which solution works better? Or are there any better suggestions?
Thanks!
Ray
You're on the right track. The AWS docs regarding key design promote your first suggestion, but they also have some warnings about the situation that you referred to as cons.
I don't think that you would have problems with different SDK parsers, but I also think that a little bit of precaution here would be a good idea. So instead of directly serializing the JSON to a string using the SDK, I would manually concatenate the values using a custom function to generate a deterministic value like "name1-value1-name2-value2" or "name1:value1-name2:value2".
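A custom function along those lines might look like the following Python sketch. The separators, the backslash escaping scheme, and the names `make_range_key` and `_escape` are all assumptions made for illustration; they are not from the AWS SDK. The caller passes the pairs in a fixed order so that every writer produces the same string.

```python
ESCAPE = "\\"
PAIR_SEP = "-"
NAME_SEP = ":"

def _escape(text):
    # Escape the escape character first, then both separators,
    # so a value containing "-" or ":" cannot be mistaken for a delimiter.
    for ch in (ESCAPE, PAIR_SEP, NAME_SEP):
        text = text.replace(ch, ESCAPE + ch)
    return text

def make_range_key(pairs):
    """Build a deterministic key such as 'name1:value1-name2:value2'.

    pairs is a list of (name, value) tuples in a fixed, agreed-upon order.
    """
    parts = [_escape(name) + NAME_SEP + _escape(str(value)) for name, value in pairs]
    return PAIR_SEP.join(parts)

print(make_range_key([("name1", "value1"), ("name2", "value2")]))
# name1:value1-name2:value2
```

Because the format is plain string concatenation with explicit rules, it should be straightforward to port the same few lines to Java or Ruby and get byte-identical keys.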

How is an MVCCKey formed in CockroachDB?

I want to create an MVCCKey with a timestamp and a pretty value that I know. But I realize a roachpb.Key is not very straightforward; is there some prefix/suffix involved? Is the database name also encoded in the roachpb.Key?
Can anyone please tell me how an MVCCKey is formed? What information does it have? In the documentation, it just says that it looks like /table/primary/key/column.
An engine.MVCCKey combines a regular key with a timestamp. MVCCKeys are encoded into byte strings for use as RocksDB keys (RocksDB is configured with a custom comparator so MVCCKeys are sorted correctly even though the timestamp uses a variable-width encoding).
Regular keys are byte strings of type roachpb.Key. For ordinary data records, the keys are constructed from table, column, and index IDs, along with the values of indexed columns. (The database ID is not included here; the database to which a table belongs can be found in the system.descriptors table)
The function keys.PrettyPrint can convert a roachpb.Key to a human-readable form.

DataColumn.Expression RowFilter on Dataview

Maybe this is a stupid question, or maybe I have designed my code completely wrong, but anyhow, here is my question...
I have a "dynamic" SQL query where it's impossible to get all the parameters I need to make the query parameterized, so I fetch my data, put it in a DataView, and then search for the rows I want to show in the DataView.
One of the columns is named id. It is the primary key and auto_increment in the table, and therefore it's an int.
Now to my question: I want to show all ids that match the number the user types in my textbox. Let us say my ids consist of 5 digits and the user enters the first 4; then in a perfect world I would get 10 matches (12340-12349, for example). Doing this on a string is very easy using RowFilter and the LIKE operator combined with a wildcard. But how can I do something similar on integers? Do I have to convert them to strings, and won't that ruin the RowFilter expression?
Not a life-or-death situation... I'm more curious whether the ice I'm walking on is very thin... :)
The RowFilter expression syntax supports the Convert function, so technically you can convert your integer ID to a string and apply LIKE to it:
MyDataView.RowFilter = "Convert(ID, 'System.String') LIKE '1234*'";
But do try to offload the filtering to the backend. It's unlikely that you have an unlimited number of parameters, and SQL is very flexible in allowing you different combinations.
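The same idea, outside of ADO.NET, is simply "compare the decimal text of the integer against a prefix". A minimal Python sketch, with a made-up helper name and sample data:

```python
def ids_with_prefix(ids, prefix):
    """Return the integer ids whose decimal representation starts with prefix."""
    return [i for i in ids if str(i).startswith(prefix)]

ids = [12339, 12340, 12345, 12349, 12350, 1234, 91234]
print(ids_with_prefix(ids, "1234"))  # [12340, 12345, 12349, 1234]
```

Note that, like the LIKE '1234*' filter, this also matches the id 1234 itself, not only the five-digit ids.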

What exactly are hashtables?

What are they and how do they work?
Where are they used?
When should I (not) use them?
I've heard the word over and over again, yet I don't know its exact meaning.
What I heard is that they allow associative arrays by sending the array key through a hash function that converts it into an int and then uses a regular array. Am I right with that?
(Notice: This is not my homework; I go to school but they teach us only the BASICs in informatics)
Wikipedia seems to have a pretty nice answer to what they are.
You should use them when you want to look up values by some index.
As for when you shouldn't use them... when you don't want to look up values by some index (for example, if all you want to ever do is iterate over them.)
You've about got it. They're a very good way of mapping from arbitrary things (keys) to arbitrary things (values). The idea is that you apply a function (a hash function) that translates the key to an index into the array where you store the values; the hash function's running time is typically linear in the size of the key, which is great when key sizes are much smaller than the number of entries (i.e., the typical case).
The tricky bit is that hash functions are usually imperfect. (Perfect hash functions exist, but tend to be very specific to particular applications and particular datasets; they're hardly ever worthwhile.) There are two approaches to dealing with this, and each requires storing the key with the value: one (open addressing) is to use a pre-determined pattern to look onward from the location in the array with the hash for somewhere that is free, the other (chaining) is to store a linked list hanging off each entry in the array (so you do a linear lookup over what is hopefully a short list). The cases of production code where I've read the source code have all used chaining with dynamic rebuilding of the hash table when the load factor is excessive.
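To make the chaining approach concrete, here is a toy Python sketch. It is not how any particular production implementation works: the class name, the default capacity, and the 0.75 load-factor threshold are assumptions chosen for illustration, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Toy hash table using chaining; for illustration, not a dict replacement."""

    def __init__(self, capacity=8, max_load=0.75):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0
        self._max_load = max_load

    def _bucket(self, key):
        # The hash function maps the key to an index into the array.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # the key is stored with the value
        self._size += 1
        if self._size / len(self._buckets) > self._max_load:
            self._resize()

    def get(self, key):
        for k, v in self._bucket(key):    # linear scan of a (hopefully short) chain
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self):
        # Rebuild the table with twice as many buckets and rehash every entry.
        old_items = [item for bucket in self._buckets for item in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for k, v in old_items:
            self._bucket(k).append((k, v))

    def __len__(self):
        return self._size


table = ChainedHashTable()
table.put("apple", 1)
table.put("pear", 2)
print(table.get("pear"))  # 2
```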
Good hash functions are one-way functions that let you create a well-distributed value from any given input, so you will get a nearly unique value for each input. They are also repeatable: the same input always generates the same output.
Examples of such hash functions are SHA-1 and SHA-256 (though SHA-1 is no longer considered collision-resistant for security purposes).
Let's say that you have a database table of users. The columns are id, last_name, first_name, telephone_number, and address.
While any of these columns could have duplicates, let's assume that no rows are exactly the same.
In this case, id is simply a unique primary key of our making (a surrogate key). The id field doesn't actually contain any user data because we couldn't find a natural key that was unique for users, but we use the id field for building foreign key relationships with other tables.
We could look up the user record like this from our database:
SELECT * FROM users
WHERE last_name = 'Adams'
AND first_name = 'Marcus'
AND address = '1234 Main St'
AND telephone_number = '555-1212';
We have to match against four different columns, potentially using four different indexes, to find the record.
However, you could create a new "hash" column, and store the hash value of all four columns combined.
String myHash = myHashFunction("Marcus" + "Adams" + "1234 Main St" + "555-1212");
You might get a hash value like AE32ABC31234CAD984EA8.
You store this hash value as a column in the database and index on that. You now only have to search one index.
SELECT * FROM users
WHERE hash_value = 'AE32ABC31234CAD984EA8';
Once we have the id for the requested user, we can use that value to look up related data in other tables.
The idea is that the hash function offloads work from the database server.
Collisions are not likely. If two users have the same hash, it's most likely that they have duplicate data.
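As a rough sketch of what myHashFunction could be, the Python snippet below uses SHA-256 from the standard `hashlib` module. The function name `row_hash` and the choice of separator are assumptions. The separator is added because plain concatenation would hash ("ab", "c") and ("a", "bc") to the same value.

```python
import hashlib

def row_hash(*fields):
    """Hash several column values into one hex digest."""
    # Join with the ASCII unit separator, assumed not to occur in the data,
    # so that field boundaries are part of what gets hashed.
    joined = "\x1f".join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

digest = row_hash("Marcus", "Adams", "1234 Main St", "555-1212")
print(len(digest))  # 64 hex characters for SHA-256
```

The resulting digest is what would be stored in the hash_value column and indexed.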

Declaring data types in SQLite

I'm familiar with how type affinity works in SQLite: You can declare column types as anything you want, and all that matters is whether the type name contains "INT", "CHAR", "FLOA", etc. But is there a commonly-used convention on what type names to use?
For example, if you have an integer column, is it better to distinguish between TINYINT, SMALLINT, MEDIUMINT, and BIGINT, or just declare everything as INTEGER?
So far, I've been using the following:
INTEGER
REAL
CHAR(n) -- for strings with a known fixed width
VARCHAR(n) -- for strings with a known maximum width
TEXT -- for all other strings
BLOB
BOOLEAN
DATE -- string in "YYYY-MM-DD" format
TIME -- string in "HH:MM:SS" format
TIMESTAMP -- string in "YYYY-MM-DD HH:MM:SS" format
(Note that the last three are contrary to the type affinity.)
I would recommend not using self-defined types. I have observed in version 3.5.6 that types not already defined could sometimes cause an INSERT command to be refused, maybe 1 time out of 1000. I don't know if this has been addressed since.
In any case, there is no sizing advantage in typing a column TINYINT or SMALLINT. The only advantage would be outside SQLite, for either parsing your column types with another program or to satisfy your personal need for tidiness. So I strongly recommend using the base types defined by SQLite and sticking to those.
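The effect of the declared type names can be observed directly. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table and column names are made up, and it assumes an ordinary (non-STRICT) table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The declared types only determine each column's affinity.
conn.execute("CREATE TABLE t (a TINYINT, b VARCHAR(3), c DATE, d BOOLEAN)")
conn.execute("INSERT INTO t VALUES ('123', 'longer than three', '2020-08-21', 1)")

row = conn.execute(
    "SELECT typeof(a), typeof(b), length(b), typeof(c), typeof(d) FROM t"
).fetchone()
print(row)  # ('integer', 'text', 17, 'text', 'integer')
conn.close()
```

The TINYINT column stores an ordinary integer, the VARCHAR(3) length is not enforced, and the DATE column keeps the string as text, which is consistent with the point that the names mainly serve the reader of the schema.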
Since SQLite is dynamically typed (a declared column type only sets an affinity), use whatever types make it easier for you to see what the schema looks like. Or you can match the types to your codebase.
I'm going to go with Kevin on this one. In short, knock yourself out. Make up brand new areas of mathematics if it suits your schema. Use the classnames of your ORM. Or name every type (except the PRIMARY KEY INTEGER ones) for ex-girlfriends. In the end SQLite is more about how you access and use the data.

Resources