I have seen a lot of people using "accountName", but some people use "account_name". Currently I think "account_name" makes more sense, because it's not a variable, it's just a field, and it's easier to read when I'm scanning a lot of strings.
Is there any standard or specification for NoSQL database field names? If not, which one do you prefer?
You can name your fields whatever you want, but remember that Firebase always tries to map between the properties in the JSON data from the Realtime Database and the properties that exist in your classes using JavaBean naming conventions.
So according to the JavaBean property pattern, it makes more sense to use accountName rather than account_name.
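For instance, a minimal JavaBean-style model class (the class and field names here are only illustrative) that the Realtime Database SDK can map automatically might look like this:

// Sketch of a JavaBean-style model class; class and field names are illustrative.
// The Realtime Database SDK maps the JSON key "accountName" to
// getAccountName()/setAccountName() by the JavaBean naming convention.
public class Account {

    private String accountName;

    // A public no-argument constructor is required for deserialization.
    public Account() {
    }

    public String getAccountName() {
        return accountName;
    }

    public void setAccountName(String accountName) {
        this.accountName = accountName;
    }
}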
I have an object stored in the Firestore database. Among other keys, it has a userId of the user who created it. I now want to store an email address, which is a sensitive piece of info, in the object. However, I only want this email address to be retrieved by the logged in user whose userId is equal to the userId of the object. Is it possible to restrict this using Firebase rules? Or will I need to store that email address in a /private collection under the Firebase object, apply restrictive firebase rules, and then retrieve it using my server?
TL;DR: Firestore document reads are all or nothing, meaning you can't retrieve a partial object from Firestore. So there is no feature at the rules level that gives you the granularity to restrict access to a specific field. The best approach is to create a subcollection with the sensitive fields and apply rules to it.
Taken from the documentation:
Reads in Cloud Firestore are performed at the document level. You either retrieve the full document, or you retrieve nothing. There is no way to retrieve a partial document. It is impossible using security rules alone to prevent users from reading specific fields within a document.
We solved this with two very similar approaches:
As you suggested, you can move your fields to a /private collection and apply rules there. However, this approach caused some issues for us, because the /private collection is completely detached from the original doc. Resolving references implied multiple queries and extra calls to FS.
The second option, which is also what the documentation suggests and is IMHO a bit better, is to use a subcollection. This is pretty much the same as a collection, but it keeps a hierarchical relationship with the parent document.
From the same docs:
If there are certain fields within a document that you want to keep hidden from some users, the best way would be to put them in a separate document. For instance, you might consider creating a document in a private subcollection
NOTE:
Those docs also include a good step-by-step guide on how to create this kind of structure in Firestore, how to apply rules to it, and how to consume the collections in various languages.
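As a rough sketch of how a client could then read the sensitive data (shown with the Android Firestore SDK; the collection, document, and field names are illustrative, and the rules on the subcollection are assumed to compare request.auth.uid with the owner's userId):

// Minimal sketch using the Android Firestore SDK; names are illustrative, not from the question.
import com.google.firebase.firestore.FirebaseFirestore;

public class PrivateEmailReader {

    // Rules on /items/{itemId}/private/{doc} are assumed to allow reads only when
    // request.auth.uid matches the userId stored on the parent item.
    public void loadOwnerEmail(String itemId) {
        FirebaseFirestore db = FirebaseFirestore.getInstance();

        db.collection("items").document(itemId)
          .collection("private").document("details")
          .get()
          .addOnSuccessListener(snapshot -> {
              String email = snapshot.getString("email"); // readable only by the owner
          })
          .addOnFailureListener(e -> {
              // PERMISSION_DENIED here if the signed-in user does not own the item
          });
    }
}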
According to the docs, the id property is special in Azure Cosmos DB documents: it must always be set and must have a unique value per partition. It also has additional restrictions on its content:
The following characters are restricted and cannot be used in the Id property: '/', '\', '?', '#'
Obviously, this field is one of the document "keys" (in addition to _rid) and is used somehow in internal plumbing. Other than the restrictions above, it is unclear how exactly this key is used internally, and, more importantly for practitioners, which values constitute technically better ids than others?
Wild guess 1: for example, in some DB worlds one would prefer short primary key values, since the PK is included in index entries and shorter keys allow a more compact index for storage and lookup. Would id field length matter at all, besides the one-time storage cost?
Wild guess 2: in some systems, better throughput is achieved if common prefixes are avoided in names (e.g. Azure Storage container/blob names), and some guides even suggest adding a small random hash as a prefix. Does Cosmos DB care about id prefix similarities?
Anything else one should consider?
EDIT: To clarify, I'm interested in what's good for the Cosmos DB server storage/execution side, given that my data model is still being designed and/or has multiple candidate keys the data designer can choose from.
First and foremost, let's clear something up: the id property is NOT unique. Your collection can have multiple documents with the exact same id. The id is ONLY unique within its own logical partition.
That said, based on all the information compiled from documentation and talks, it doesn't really matter what value you choose to go with. It is a string and Cosmos DB will treat it as such, but it is also considered a "primary key" internally, so restrictions apply, such as ordering by it.
Where it does matter is in your consuming application's business logic. The id plays a double role: it is a Cosmos DB property, but it is also your property. You get to set it. This is the value you are going to use to make direct reads to the database. If you use any other value, then it's no longer a read, it's a query. That makes it more expensive and slower.
A good value to set is the id of the entity that is hosted in this collection. That way you can use the entity's id to read quickly and efficiently.
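As a sketch of that difference (using the Azure Cosmos DB Java SDK v4; the account, database, container, and entity names below are made up):

// Minimal sketch with the Azure Cosmos DB Java SDK v4; all names are illustrative.
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.models.CosmosItemResponse;
import com.azure.cosmos.models.PartitionKey;

public class OrderReader {

    public static class Order {
        public String id;         // the entity's own id, reused as the Cosmos DB id
        public String customerId; // partition key
        public double total;
    }

    public Order readOrder(String orderId, String customerId) {
        CosmosClient client = new CosmosClientBuilder()
                .endpoint("https://<your-account>.documents.azure.com:443/")
                .key("<your-key>")
                .buildClient();
        CosmosContainer container = client.getDatabase("shop").getContainer("orders");

        // Point read: needs id + partition key, and is the cheapest, fastest access path.
        // Looking the order up by any other property would be a query instead,
        // which costs more RUs and is slower.
        CosmosItemResponse<Order> response =
                container.readItem(orderId, new PartitionKey(customerId), Order.class);
        return response.getItem();
    }
}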
In the Firestore security rules you can access resource properties. I would like to use these properties in my queries, but I can't find any documentation on it.
Currently I am manually writing updatedAt timestamps into documents where I need them, but that is cumbersome and fragile, because it is easy to forget to update the timestamp. It also feels redundant, since the resource already has this data.
Is it, for example, possible to query all documents in a collection that have been updated since yesterday?
It is not possible to query on these; they are specific to the Security Rules layer.
While we can inspect the server update time for a specific document once it has been retrieved, we cannot query by it, since it is not indexed (it is handled at a layer below our indexing engine).
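As a sketch of the manual workaround mentioned in the question (Android Firestore SDK; the collection and field names are illustrative), you can write a server-generated timestamp into the document yourself and query on that field:

// Minimal sketch using the Android Firestore SDK; names are illustrative.
import com.google.firebase.firestore.FieldValue;
import com.google.firebase.firestore.FirebaseFirestore;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class UpdatedAtExample {

    private final FirebaseFirestore db = FirebaseFirestore.getInstance();

    // On every write, also set an updatedAt field from the server clock.
    public void updateTitle(String postId, String title) {
        Map<String, Object> update = new HashMap<>();
        update.put("title", title);
        update.put("updatedAt", FieldValue.serverTimestamp());
        db.collection("posts").document(postId).update(update);
    }

    // Query everything updated in the last 24 hours.
    public void loadRecentlyUpdated() {
        Date yesterday = new Date(System.currentTimeMillis() - 24L * 60 * 60 * 1000);
        db.collection("posts")
          .whereGreaterThan("updatedAt", yesterday)
          .get()
          .addOnSuccessListener(snapshots -> {
              // snapshots holds the documents whose updatedAt is newer than yesterday
          });
    }
}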
I'm an MSSQL developer who was recently tasked with building a new application using DynamoDB, since we use AWS and we wanted a highly scalable database service.
My biggest concern is data integrity. For example, I have a table for all my users where every row needs to have a username, email, and name field, all strings, plus a verified field that's an int. Is there any way to require all entries in that table to have those fields and to be of that particular type?
Since the application is in PHP, I'm using Kettle as my ORM, which should prevent me from messing up the data integrity, but another developer voiced a concern about what happens if we ever add another application, or if someone manually changes some types via the console.
https://github.com/inouet/kettle
Currently, no, you are responsible for maintaining the integrity of your items with respect to the existence of attributes that are not keys on the base table. However, you can use LSIs and GSIs to enforce the data types of attributes (notwithstanding my qualm that this is not a recommended pattern, as it could cause partition heat, especially for attributes whose range of values is small). For example, verified seems like it might take only 0 or 1 as a value, so if you create a GSI with PK=verified where verified is a Number, writes to the base table may get throttled by the verified GSI.
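To illustrate the kind of application-level guard this implies (sketched here with the AWS SDK for Java v2 rather than the question's PHP; the table and attribute names mirror the question but the code is only an illustration, not Kettle's API):

// Sketch with the AWS SDK for Java v2; DynamoDB itself will not enforce this shape,
// so the application validates the required fields and types before writing.
import java.util.Map;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

public class UserWriter {

    private final DynamoDbClient client = DynamoDbClient.create();

    public void putUser(String username, String email, String name, int verified) {
        if (username == null || email == null || name == null) {
            throw new IllegalArgumentException("username, email and name are required");
        }

        Map<String, AttributeValue> item = Map.of(
                "username", AttributeValue.builder().s(username).build(),
                "email",    AttributeValue.builder().s(email).build(),
                "name",     AttributeValue.builder().s(name).build(),
                "verified", AttributeValue.builder().n(Integer.toString(verified)).build());

        client.putItem(PutItemRequest.builder()
                .tableName("users") // illustrative table name
                .item(item)
                .build());
    }
}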
What is the best way to design domain objects that can have multi-lingual fields? An example would be a Product class whose Description is multi-lingual.
I have found a few links, but could not decide which one is the best approach.
http://fabiomaulo.blogspot.com/2009/06/localized-property-with-nhibernate.html
(This stores all localized language data in one field. That can be a problem if we query from SQL.)
http://ayende.com/Blog/archive/2006/12/26/LocalizingNHibernateContextualParameters.aspx
(This one has a warning at the beginning that it is a hack and no longer supported)
http://www.webdevbros.net/2009/06/24/create-a-multi-languaged-domain-model-with-nhibernate-and-c/
(This does not describe how multilingual data will be structured in the database.)
Does anyone have experience using NHibernate with multi-lingual data? Is there a better way?
The third option looks great. The NHibernate mapping is given, but not the database schema; if that's what you are missing, then I'll sketch it out here:
dictionary
----------
ID: int - identity
name: nvarchar(255)

phrase
------
dictionary_id: int (fkey dictionary.ID)
culture_id: int (LCID)
phrase: nvarchar(255) - this is the default size - seems too small
According to this blog entry, 255 is the default string length for String values. To overcome the short string length on the phrase text, you can change the <element> tag to
<element column="phrase" type="String" length="4001"></element>
To use this in your domain model, you add a PhraseDictionary property to your entity wherever you want translatable text, e.g. the title property or description property.
I think the article describes a great approach, and it is the one that I would go for.
EDIT: In response to the comments, make the length less than 4001 if you know the absolute maximum size is less than that, as this will typically be faster. Also, NHibernate will lazily fetch the collection, but it may fetch all the items at once. You can profile to determine if this has any performance implications. (If you have only a handful of languages then I doubt you will see a difference.) If you have many languages (say 50+), then it may be worthwhile creating custom properties to fetch the localized text. These will issue queries to fetch specifically the text required. More importantly, you may be able to fetch all the text for a given entity in one query, rather than each localized text property as a separate query.
Note that this extra effort is only needed if profiling gives you reason to be concerned about the performance. Chances are that the implementation in the article as is will function more than adequately.
I only have experience with Hibernate, but since NHibernate is so similar:
One option is to define a component type MultiLingualString with members for each language (this assumes the set of languages is known at coding time). This type is also a convenient location to place a getter for the string by language id.
class MultiLingualString {
    String english;
    String chinese;
    String klingon;

    String forLanguage(Language lang) {
        switch (lang) {
            case ENGLISH: return english;
            case CHINESE: return chinese;
            case KLINGON: return klingon;
            default: throw new IllegalArgumentException("Unknown language: " + lang);
        }
    }
}
This results in the strings for all languages being stored in separate columns in the database while the representation in the object world retains fine granularity.
The advantage is that no join is required to fetch the strings. On the other hand, the only way not to fetch a string with this approach is to use a projection, which is a severe limitation if the strings are large, numerous and rarely needed.
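To illustrate, the owning entity simply embeds the component (the Product entity here is hypothetical, following the question's example):

class Product {
    Long id;

    // Mapped as a component: the english/chinese/klingon strings become
    // separate columns on the product table, so no join is needed.
    MultiLingualString description;

    String descriptionFor(Language lang) {
        return description.forLanguage(lang);
    }
}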
If you do this a lot, writing a UserType might be worth it.
From a strictly database-oriented standpoint with SQL Server, you should have one table with all of the base data (record key, dates, numbers, etc.) and one table with all of the translatable string data. Let's call the two tables Base and Base_Description.
Base ensures that there is a single key for each record; the key might be a string or an auto-generated id, depending on your particular use case.
The Base_Description table is related to the Base table, but also contains a value to select the language that the data is in. In my projects we use the langid column from sys.languages, because we can set the language of the connection with SET LANGUAGE and then grab it with @@LANGID for most operations.
In our testing we found this to be significantly faster than having multiple fields for each language, and it also allows you to add other languages more easily. We are also using SQL Server full-text indexing, and it fully works with this method. You should index in the neutral language, and then you can pick the language to search against at run time (also filtering against the LangID column in Base_Description).
Do your requirements include the domain objects actually having multiple-language properties in the same object? If so, are there unlimited translations stored in the object (in a collection, say, in which case I would say it would need to be treated just like any master/detail or parent/child collection), or a fixed set of translations, in which case the languages (and thus the mapping to results of a stored proc or whatever) have to be determined statically anyway?
In many internationalized applications I worked on, the data was in only one language - customer names, the product names (there was no point in mapping even identical products used in one country to products in another, they all had different distributors and different SKUs, and of course localized pricing). The interface was also only in one language (at a time). So all the domain objects only required one language at a time. Thus the language of the translation would be determined when the object was instantiated.
We had translation user interfaces which allowed users to update the translated texts, but these only required two languages at a time (local and the default). I can see this being closest to what you are talking about. I guess that you would have child collections for each translatable property with all the possible translations in the collection. This would probably be closest to the second solution in the third article you linked. Of course, at this point you would also need to see if you want eager/lazy loading etc.