BerkeleyDB duplicate data items as append-only log?

I'm looking into using BerkeleyDB Java Edition for a project. I've only read some of the documentation so far (not written any code) but it looks like a good match.
One of the features I would like is an append-only log for a particular key, e.g.:
«my key» => «snapshot 1»
         => «snapshot 2»
         => «snapshot 3»
From the Duplicate Data Items documentation, it looks like if I set the DB_DUP flag, I can write a number of items for a key (in configurable order) and then retrieve them with a cursor.
Is this a sensible / suitable use for BerkeleyDB?
(I do have other reasons for wanting to use BerkeleyDB in the project, this isn't my primary use case. I am aware of all of the capabilities in Redis but in-memory isn't suitable)

You can certainly use Berkeley DB the way you describe. It's a little more challenging than a straightforward key-value store, as there are a number of additional flags on cursor operations that you'll need to pay attention to. It has the benefit that it only stores one copy of the key! But in at least some versions, you couldn't store identical data items under the same key if the duplicates were sorted. Just, you know, caveats.
If you're not concerned about the storage space of the keys, you might consider using a monotonically increasing number at the end of your key. Then you could just use it like a simple key-value store, inserting your records like this:
«my key.1» => «snapshot 1»
«my key.4» => «snapshot 2»
«my key.9» => «snapshot 3»
You'll still bring them in with a cursor, like you would have for duplicate data items. Just start your search at «my key.0» and terminate it when you see «my other key.x». My bet is that you'll be able to get something working with less head scratching.
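To make the key-suffix scheme concrete, here is a minimal sketch in Python. It does not use the Berkeley DB API: a sorted list with bisect stands in for the B-tree and its cursor, and the names SortedStore, scan, and log_key are hypothetical. One assumption goes beyond the text above: the sequence number is zero-padded, because keys compare as strings and «my key.10» would otherwise sort before «my key.2».

```python
import bisect


class SortedStore:
    """Toy ordered key-value store; a stand-in for a B-tree database."""

    def __init__(self):
        self._keys = []    # kept sorted, like the keys of a B-tree
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        # Position the "cursor" at the first key >= prefix.
        i = bisect.bisect_left(self._keys, prefix)
        # Stop as soon as a key from another prefix shows up.
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._values[self._keys[i]]
            i += 1


def log_key(base, seq, width=10):
    # Zero-pad so that lexicographic order matches numeric order (assumption).
    return f"{base}.{seq:0{width}d}"


store = SortedStore()
store.put(log_key("my key", 1), "snapshot 1")
store.put(log_key("my key", 4), "snapshot 2")
store.put(log_key("my key", 9), "snapshot 3")
store.put(log_key("my key", 10), "snapshot 4")
store.put(log_key("my other key", 1), "unrelated")

snapshots = [value for _, value in store.scan("my key.")]
print(snapshots)
```

The scan starts at the prefix «my key.» and ends at the first key that no longer carries it, which mirrors the "start at «my key.0», stop at «my other key.x»" loop described above.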

Related

How do I read and write to the google cloud datastore?

The documentation in google cloud's datastore section is very confusing to me.
I was under the impression that Datastore could be used as key-value storage (similar to MongoDB). But I guess I misunderstood how its keys work: I can't use string keys outright, and instead need to turn a series of strings into a key via some list => dataStore.key(list) transformation.
That seems odd: I don't understand why a list is used instead of a string, or why I can't pass the list directly and have to call datastore.key first, but I can live with that. However, after experimenting a little, I discovered that datastore.key(list) returns different values for the same list when I run it repeatedly!
So, now I need to somehow remember this key somewhere, but the reason I wanted to use datastore was that I was running in a service with no persistent memory to begin with.
Am I using datastore wrong? Can I use it at all for simple persistent key-value storage?
It appears that the issue was that I used a list that was too short.
The datastore expects collections, which contain documents, each of which is a key-value mapping. So instead of having my key point at a mapping, I needed to set a collection, and in it have keys mapped to mappings. (In other words, I needed to have one more element in the keys list)
results = await this.dataStore.get(this.dataStore.key([collectionId, documentId]));
and
await this.dataStore.save({ key: this.dataStore.key([collectionId, documentId]), data: someKeyValueObject });

How do I query a Drupal Shortcut (8.x/9.x) by its URL or Menu Path?

We're rolling out an update to an in-house Drupal installation profile, and one of the menu paths that is used frequently is getting changed. Most of our installations reference that menu path in a shortcut (via the "Shortcut" module in core). In an update hook, we'd like to be able to query for those shortcuts and update them.
It feels like this should be straightforward, but for some reason we're finding it difficult to query for shortcuts by their url. We can query them by title, but that seems fragile (since the title could be different between installations, might be different by localization, etc.).
We tried the following, but this led to the error message 'link' not found:
// This does NOT work.
$shortcuts_needing_update =
  \Drupal::entityTypeManager()
    ->getStorage('shortcut')
    ->loadByProperties([
      'link' => [
        'internal:/admin/timeline-management',
      ],
    ]);
// This works, but is fragile.
$shortcuts_needing_update =
  \Drupal::entityTypeManager()
    ->getStorage('shortcut')
    ->loadByProperties([
      'title' => 'My shortcut',
    ]);
Based on the code in \Drupal\shortcut\Entity\Shortcut::baseFieldDefinitions() and \Drupal\shortcut\Controller\ShortcutSetController::addShortcutLinkInline(), it's clear that Shortcut entities have a property called link that can be set as an array containing a uri key, yet it does not seem possible to query by this property even though it's a base field.
Looking at the database, it appears that Drupal stores the URL in a database column called link__uri.
TL;DR That means that this works:
$shortcuts_needing_update =
  \Drupal::entityTypeManager()
    ->getStorage('shortcut')
    ->loadByProperties([
      'link__uri' => 'internal:/admin/old/path',
    ]);
Read on if you want to know the subtle reason why this is the case.
Drupal's database layer uses pluggable "table mapping" objects to tell it how to map an entity (like a Shortcut) to one or more database tables and database table columns. The logic for generating a column name for a field looks like this in the default table mapping (\Drupal\Core\Entity\Sql\DefaultTableMapping):
As shown above, if a field indicates it allows "shared" table storage, and the field has multiple properties (uri, title, etc.), then the mapping flattens the field into distinct columns for each property, prefixed by the field name. So, a Shortcut entity with 'link' => ['uri' => 'xyz'] becomes the column link__uri with a value of xyz in the database.
You don't see this often with entities like nodes, which is why this seems strange here. I'm usually accustomed to seeing a separate database table for things like link fields. That's because nodes and other content entities don't usually allow shared table storage for their fields.
How does the mapping determine if a field should use shared table storage? That logic looks like this:
So, the default table mapping will use shared table storage for a field only under specific circumstances:
The field can't have a custom storage handler (checks out here since shortcuts don't provide their own storage logic).
The field has to be a base field (shortcuts are nothing without a link, so that field is defined as a base field as mentioned in the OP).
The field has to be single-valued (checks out -- shortcuts have only one link).
The field must not have been deleted (checks out; again, what is a shortcut without a link field?).
This specific set of circumstances isn't often satisfied by nodes or other content entities, which is why it's a bit surprising here.
We can confirm this by using Devel PHP to ask the table mapping for shortcuts directly, with code like the following:
$shortcut_table_mapping =
  \Drupal::entityTypeManager()
    ->getStorage('shortcut')
    ->getTableMapping();
$efm = \Drupal::service('entity_field.manager');
$storage_definitions = $efm->getFieldStorageDefinitions('shortcut');
$link_storage_definition = $storage_definitions['link'];
$has_dedicated_storage = $shortcut_table_mapping->requiresDedicatedTableStorage($link_storage_definition);
// The property is named 'uri' (not 'url'), matching the link__uri column.
$link_column = $shortcut_table_mapping->getFieldColumnName($link_storage_definition, 'uri');
dpm($has_dedicated_storage, 'has_dedicated_storage(link)');
dpm($link_column, 'link_column');
This results in the following:

SQLite | WebSQL - Grouped double ordering? That sounds wrong

I have no idea what to call what I'm trying to do, but I can explain it quite well. I have two tables with the following structure in my WebSQL database, which my mobile (hybrid) application uses to store user messages. These are my tables:
message_threads [ thread_id, user_id, last_seen, last_active ]
messages [ message_id, thread_id, message_type, message_content, message_date ]
I already have all of the logic handled for adding messages to the database, and creating new message threads, but the problem I have is ordering them when trying to retrieve them.
I would like to order my results (of messages) by the last_active field of the message_thread with the corresponding thread_id. However, now that I've done that, I also want to order all of the messages for that thread by the message's message_date field.
So, basically, I want to group all of my messages by their thread, order the thread by the last_active field, then order the messages inside of that thread by the message_date field. I assume there's a way to do this in SQLite, and if not I can just do a lot of loop logic on the front-end, which really won't hurt anything, but it's always nice to know the tricks of the query language.
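One way to express this in a single query is to join the two tables and sort on several keys: the thread's last_active first, then the message's message_date. Below is a minimal sketch using Python's built-in sqlite3 module and the table layout above. The sample rows, the integer timestamps, and the choice of newest thread first are assumptions for illustration; the SQL itself should carry over to WebSQL, which is backed by SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message_threads (thread_id INTEGER PRIMARY KEY, user_id INTEGER,
                              last_seen INTEGER, last_active INTEGER);
CREATE TABLE messages (message_id INTEGER PRIMARY KEY, thread_id INTEGER,
                       message_type TEXT, message_content TEXT,
                       message_date INTEGER);
""")

# Hypothetical sample data: thread 2 was active more recently than thread 1.
conn.executemany("INSERT INTO message_threads VALUES (?, ?, ?, ?)",
                 [(1, 10, 0, 100), (2, 11, 0, 300)])
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, "text", "a", 50), (2, 2, "text", "b", 250),
                  (3, 1, "text", "c", 100), (4, 2, "text", "d", 300)])

rows = conn.execute("""
    SELECT m.thread_id, m.message_id, m.message_content
    FROM messages AS m
    JOIN message_threads AS t ON t.thread_id = m.thread_id
    ORDER BY t.last_active DESC,  -- most recently active thread first
             t.thread_id,         -- keeps threads with equal last_active apart
             m.message_date ASC   -- oldest message first within a thread
""").fetchall()
print(rows)
```

No GROUP BY is needed: sorting by the thread columns first already keeps each thread's messages together, and the last sort key orders them inside the thread.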

Why does DynamoDB require expressionAttributeValue?

I'm learning about how to filter results from a scan or query using Amazon's DynamoDB. I would expect an example filter to look like filter => name = Bob or some such. However, Amazon requires the use of an expression attribute value placeholder, such as filter => name = :person, together with ExpressionAttributeValues => { ":person": {"S": "Bob"}}.
This is confusing and hurts my head, why can't I use the simple name = Bob?
Official docs: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#FilteringResults
Apparently working example near end: https://github.com/aws/aws-cli/issues/1073
This type of syntax follows an approach that is similar to prepared statements that are used in SQL systems. This was a design decision that the DynamoDB team at AWS made. One of the reasons is to allow fields that conflict with the lengthy list of reserved words (including 'name' that you were using in your example) that are defined by DynamoDB.
Avoiding reserved words is actually performed by using the ExpressionAttributeNames attribute and specifying the attribute names. You were referencing ExpressionAttributeValues which is where the list of values is specified. More information is available on the Using Placeholders for Attribute Names and Values documentation page.
Another motivation for this design is to separate the statement from the parameter names and values, similar to prepared statements in SQL, as already mentioned. While this may seem odd at first, it has the added benefit of effectively sanitizing your inputs in a NoSQL sense, so that malicious or accidental user input cannot change the behavior of your request to DynamoDB.
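As a rough sketch of how the two placeholder maps fit together, the following Python builds the parameters for a Scan request as a plain dictionary. FilterExpression, ExpressionAttributeNames, and ExpressionAttributeValues are the request parameter names discussed above. The table name People, the placeholder names #n and :person, and the helper build_scan_params are made up for illustration. No AWS SDK is called here.

```python
def build_scan_params(table_name, person):
    """Return Scan parameters that filter on the reserved word 'name'."""
    return {
        "TableName": table_name,
        # The expression holds only placeholders, never raw user input.
        "FilterExpression": "#n = :person",
        # '#n' stands in for the attribute name, which is a reserved word.
        "ExpressionAttributeNames": {"#n": "name"},
        # ':person' carries the value with its DynamoDB type ('S' = string).
        "ExpressionAttributeValues": {":person": {"S": person}},
    }


# Input that looks like an expression stays a plain string value.
params = build_scan_params("People", "Bob OR 1 = 1")
print(params["FilterExpression"])
print(params["ExpressionAttributeValues"])
```

Whatever the user types ends up only inside ExpressionAttributeValues, so the filter expression itself never changes. A dictionary of this shape should be usable with a low-level client call such as boto3's client.scan(**params).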

Statically referencing values in doctrine

Often I need to look up certain rows from the database. A simplistic example would be creating a new user and referencing 'Mr' from a table defining possible salutations.
Peppering the code with static references to a database value would seem really bad, i.e.:
$em->getRepository('Salutations')->findOneBy(array('name' => 'Mr'));
So instead I create a constant in the Salutations entity, i.e.:
$em->getRepository('Salutations')->findOneBy(array('name' => Salutations::MR));
This at least limits some database changes to only affecting one file but does not seem ideal. Is there a better way to statically reference database values?
Constants are great for this, if you need to save time typing and avoid typos.
When I need to control which values are used, e.g. not to allow "Misses", I use a ManyToOne association to another entity, e.g. Salutation.
