Firestore rules: maximum of 1000 expressions to evaluate has been reached - firebase

I added new functionality to my application and now I'm getting this error:
maximum of 1000 expressions to evaluate has been reached
raised by Firestore rules.
Since it seems that far fewer than 1000 checks are needed for that specific write, I would like some suggestions on how to avoid this behaviour without introducing vulnerabilities.
Are logical expressions short-circuited?
What is defined as an expression?
Will this limit be extended?
Do you have any advice to avoid this problem?

Logical expressions are short-circuited.
An expression is anything that evaluates to some value. For example, true is one expression. false || false is three expressions.
There is no roadmap to change the limit. The limit is in place to prevent excessive resources being used by every operation. Bear in mind also that security rules are free (except for document access), and there are always going to be strict limits on what is provided for free.
Since we can't see your rules, it's not really possible to recommend exact advice. You should consider using functions to cut down on the number of expressions evaluated for some access. If you find yourself typing things like request.resource.data.foo a lot, consider using a function and pass request.resource.data to it to extract values rather than evaluating request.resource.data repeatedly, which is three expressions.

Related

Is there a workaround for the Firebase Query "NOT-IN" Limit to 10?

I saw a similar question here: Is there a workaround for the Firebase Query "IN" Limit to 10?
The problem is that with an in query, taking the union of the chunked results works, but with a not-in query I would need the intersection, and the union gives me all the documents. Does anyone know how to do this?
As @samthecodingman mentioned, it's hard to provide specific advice without examples / code, but I've had to deal with this a few times and there are a few generalized strategies you can take:
Restructure your data - You can use up to 100 equality operators, so one possible approach is to store your filters/tags as a map, for example:
    id: 1234567890,
    ...
    filters: {
        filter1: true,
        filter2: true,
        filter3: true,
    }
If a doc doesn't have a particular tag, you could simply omit it, or you could set it to false, depending on your use case.
Note, however, that you may need to create composite indexes if you want to combine equality operators with inequality operators (see the docs). If you have too many filters, this will get unwieldy quickly.
Query everything and cache locally - As you mentioned, fetching all the data repeatedly can get expensive. But if it doesn't change too often or it isn't critical to get the changes in real time, you can cache it locally and refresh at some interval (hourly or daily, for example).
Implement Full-Text Search - If neither of the previous options will work for you, you can always implement full-text search using one of the services Firebase recommends like Elastic. These are typically far more efficient for use-cases with a high number of tags/filters, but obviously there's an upfront time cost for setup and potentially an ongoing monetary cost if your usage is higher than the free tiers these services offer.
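As a rough illustration of the first two strategies above, here is a minimal JavaScript sketch. It assumes each document carries a filters map with an explicit false for tags it lacks, since an equality filter can only match a field that exists. The helper names buildExclusionClauses and excludeLocally are hypothetical. The first helper only builds [field, operator, value] triples that you would hand to your own query code; the second filters an already-cached array in memory. Neither calls the Firebase SDK.

```javascript
// Hypothetical helper: one equality clause per excluded tag.
// Assumes documents store `filters.<tag>: false` when the tag is absent.
function buildExclusionClauses(excludedTags) {
  return excludedTags.map((tag) => [`filters.${tag}`, "==", false]);
}

// Hypothetical helper for the "cache locally" strategy: drop cached
// documents that carry any of the excluded tags.
function excludeLocally(cachedDocs, excludedTags) {
  const excluded = new Set(excludedTags);
  return cachedDocs.filter((doc) =>
    Object.entries(doc.filters || {}).every(
      ([tag, enabled]) => !(enabled && excluded.has(tag))
    )
  );
}

// Made-up sample data standing in for a locally cached collection.
const cached = [
  { id: 1, filters: { filter1: true, filter2: false } },
  { id: 2, filters: { filter2: true } },
  { id: 3, filters: {} },
];
const remaining = excludeLocally(cached, ["filter2"]);
console.log(remaining.map((doc) => doc.id));
console.log(buildExclusionClauses(["filter1", "filter2"]));
```

The local filter has no limit on the number of excluded values, but it only pays off if the cached data is small enough to hold in memory and can tolerate being slightly stale.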

How to OR Query for contains in Dynamoose?

I want to query for several strings in a column in DynamoDB, using Dynamoose (https://github.com/dynamoose/dynamoose).
But my query returns nothing. Is this type of query allowed, or is there another syntax for it?
Code sample:
Cat.query({"breed": {"contains": "Terrier","contains": "husky","contains": "wolf"}}).exec()
I want all of these breeds, so this should be an OR query. Please help.
Two major things here.
First, a query in DynamoDB requires a condition where a given hashKey is equal to something. This must be either the hashKey of the table or the hashKey of an index. So even if you could get this working, the query would fail, since you can't test multiple values for that key. It must be hashKey = _______, with no OR statements or anything else in that first condition.
Second, just to answer your question: it seems like what you are looking for is the condition.in function. This would change your code to something like:
Cat.query("breed").in(["Terrier", "husky", "wolf"]).exec()
Of course, the code above will not work because of the first point.
If you really want to brute-force this to work, you can use Model.scan, basically changing query to scan in the syntax. However, scan operations are extremely heavy on the DB at scale. A scan looks through every document/item before applying the filter and then returns the matches to you, so you get none of the optimization that you would normally get. If you only have a handful of documents/items in your table, it might be worth it to take the performance hit. In other cases, like exporting or backing up the data, it also makes sense. But if you are able to avoid scan operations, I would. It might require some rethinking of your DB structure, though.
Cat.scan("breed").in(["Terrier", "husky", "wolf"]).exec()
So the code above would work and I think is what you are asking for, but keep in mind the performance & cost hit you are taking here.
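Two details may be worth spelling out here. The sketch below is plain JavaScript with made-up sample data and does not call Dynamoose. First, an object literal that repeats the contains key keeps only the last value, so the condition object in the question cannot express three alternatives. Second, in matches whole values while contains matches substrings, so the two are not interchangeable. The matchesAnyBreed helper is hypothetical and only mimics, in memory, what a scan with an OR-ed contains filter has to do: look at every item and keep the ones that match.

```javascript
// Duplicate keys in an object literal: the last one wins.
const condition = { contains: "Terrier", contains: "husky", contains: "wolf" };
console.log(Object.keys(condition).length, condition.contains);

// Hypothetical in-memory stand-in for "breed contains A OR B OR C".
function matchesAnyBreed(item, needles) {
  return needles.some((needle) => item.breed.includes(needle));
}

// Made-up sample items; a real scan would read every item in the table.
const cats = [
  { name: "Rex", breed: "Yorkshire Terrier" },
  { name: "Luna", breed: "husky" },
  { name: "Tom", breed: "tabby" },
];
const needles = ["Terrier", "husky", "wolf"];

// Substring match (contains) versus whole-value match (in).
const byContains = cats.filter((cat) => matchesAnyBreed(cat, needles));
const byExactValue = cats.filter((cat) => needles.includes(cat.breed));
console.log(byContains.map((cat) => cat.name));
console.log(byExactValue.map((cat) => cat.name));
```

If the stored breed values are always one of a known set of exact strings, the in form is enough; if they are free text, only a substring match finds them all.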

Firestore rules, atomic writes, and write limits

I have a two-part question regarding Firestore rule evaluations. The parts are related which is why the single question here...
Part I - Atomic write access
Let's say that I have written a rule such as
allow write: if resource.data.claimedBy == null && request.resource.data.claimedBy == request.auth.uid;
The idea is that any user can make a claim to this resource. But, what if this resource is made available to 1000 users all at once and they all jump to make a claim and make the .update() call all at the same time?
Will this be a first-wins scenario? Will the field be set for the first user, the winner, and everyone else have their writes rejected because the rule fails once a value is present from the winner's write? Or is there any risk whatsoever of a race condition, where for a moment the value is one thing and then becomes another?
I feel like the rules would prohibit a race condition, but I don't know for certain.
Part II - Write limits
Ok, so this part builds off of the first. Firestore has a write limit of 1 write/second in general for a single document. Let's assume Part I works how I hope and the other 999 users will get a write rejection. Do these rejections count towards the write limit of 1 write/sec because a write was initiated, or do they not count because the rules prohibit an actual write?
Obviously, having all these claim attempts at once count as 1000 writes would be bad for the 1 write/second limit.
I am assuming here, but I believe it would not count toward that limit because my understanding is that the limit is imposed by the nature of the underlying storage mechanics, and the rules prevent going to that layer upon rejection. But also again, I don't know for certain.
Part III - Bonus part
Do writes that are rejected by a rule still count as a "write" as far as billing is concerned? I know a query for a document that does not exist (no documents actually read) still counts as a single read, so I am wondering whether writes that are blocked by the rules work in a similar way and incur a charge.
Thank you so much!
You should use a transaction to avoid and prevent concurrent writes. The transaction can check if the document was previously claimed, then abort if it was.
If a write was denied by a rule, it doesn't count toward any write limits or billing, as no data in the document was actually changed.
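The answer gives no code, so here is a minimal sketch of the transactional claim it describes. It assumes the shape of the Firebase modular SDK, where runTransaction(db, updateFunction) hands the callback a transaction with get and update methods. To keep the sketch runnable without Firebase, runTransaction is passed in as a parameter, and createFakeStore is a made-up in-memory stand-in that simply runs transactions one at a time. The names claimDocument, createFakeStore, and the user ids are all hypothetical.

```javascript
// Sketch: claim a document only if nobody has claimed it yet.
// `runTransaction` is passed in; with the Firebase modular SDK it would be
// the function of the same name imported from "firebase/firestore".
async function claimDocument(runTransaction, db, docRef, uid) {
  return runTransaction(db, async (transaction) => {
    const snapshot = await transaction.get(docRef);
    const data = snapshot.data() || {};
    if (data.claimedBy != null) {
      return false; // already claimed, so finish without writing
    }
    transaction.update(docRef, { claimedBy: uid });
    return true;
  });
}

// In-memory fake (an assumption for demonstration only) so the sketch
// runs without Firebase. It executes transactions strictly in sequence.
function createFakeStore(initialData) {
  let stored = { ...initialData };
  let queue = Promise.resolve();
  const runTransaction = (_db, updateFn) => {
    const result = queue.then(() =>
      updateFn({
        get: async () => ({ data: () => ({ ...stored }) }),
        update: (_ref, changes) => {
          stored = { ...stored, ...changes };
        },
      })
    );
    queue = result.catch(() => {});
    return result;
  };
  return { runTransaction, read: () => ({ ...stored }) };
}

// Three users try to claim the same document at once.
async function demo() {
  const store = createFakeStore({ claimedBy: null });
  const results = await Promise.all(
    ["alice", "bob", "carol"].map((uid) =>
      claimDocument(store.runTransaction, null, "resource/1", uid)
    )
  );
  return { results, winner: store.read().claimedBy };
}

demo().then((outcome) => console.log(outcome));
```

The check inside the transaction mirrors the claimedBy == null condition in the rule, so a losing client finds out from its own read that the document is already claimed instead of relying only on a rejected write.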

Search query to find documents that have multiple elements

I have a few XML documents in MarkLogic which have the structure:
<abc:doc>
  <abc:doc-meta>
    <abc:meetings>
      <abc:meeting>
      </abc:meeting>
      <abc:meeting>
      </abc:meeting>
    </abc:meetings>
  </abc:doc-meta>
</abc:doc>
We can have more than one <abc:meeting> element under the <abc:meetings> element.
I am trying to write a cts:search query to get only documents that have more than one <abc:meeting> element in the document.
Please advise
This is tricky. Ideally, you'd want to drive searches from indexes for best performance. Unfortunately, MarkLogic doesn't keep track of element counts in its universal index, and aggregating counts from a range index can be cumbersome.
The overall simplest solution would be to add a count attribute on abc:meetings, and then add a range index on that. It does mean you'd have to change your data, and you'd have to keep that attribute in sync with each change.
You could also just search on the presence of abc:meeting with cts:element-query(), and append an XPath predicate to count the number of elements afterwards. Something like:
cts:search(
  collection(),
  cts:element-query(xs:QName('abc:meeting'), cts:true-query())
)[count(.//abc:meeting) > 1]
If not many documents contain meetings, this might work fairly well for you, but it still requires pulling up all documents containing meetings, hence could be expensive.
I played with the thought of leveraging cts:near-query(), but that is driven on word positions, so depends on the actual amount of tokens inside a meeting. If that were always an exact number of tokens (unlikely I'd guess), you could use the minimal-distance option on a double cts:element-query() wrapped in a cts:near-query(). It might help optimize the previous option a little though.
The most performant option I can think of right now involves adding a User-Defined aggregate Function (UDF). Unfortunately, that means compiling C++ code. I happen to have written such a UDF in the past, which you should be able to use as-is after compilation and installation. For details see:
https://github.com/grtjn/doc-count-udf
and
http://docs.marklogic.com/guide/app-dev/aggregateUDFs
HTH!
It boils down to how many "a few" is. If it's thousands or fewer, then what grtjn presents above for a cts:search plus an XPath expression will work fine. If it's more, I'd add the count attribute to abc:meetings and then use a pre-commit trigger (e.g. on the collection of these documents) to ensure that the count attribute value is kept in sync. You'd need a range index to be able to query for "Documents that have a count of meetings of 2 or greater".
Of course, if all you need to query on is whether there's more than one meeting, then just add a "multiple" attribute to abc:meetings with a value of "true". Then you don't need a range index - you can do a cts:element-attribute-value-query on abc:meetings and multiple="true".

REST resources with a triple as a parameter

When needing to create a URL that takes a finite set of parameters, where all of said parameters are semantically the same "level", what is the current consensus around the use of delimiters within URLs? Here's an example:
/myresource/thing1,thing2,thing3
/myresource/thing2,thing1
/myresource/thing1;thing2;thing3
/myresource/thing1;thing3
That is to say, the parameter here could be a single, a pair or a triple. They can be specified in any order because they are not a logical tree, and thing2 is not a subordinate resource of thing1, so doing something like this seems "wrong":
/myresources/thing1/thing2/thing3
This bothers me because it implies a tree-like relationship between the elements of the triple, and that is not the case (despite many HTTP frameworks seemingly pushing this, wrongly in my view). In addition, using a query string doesn't feel right as this is not a search operation, it is a known triple in a very finite space - there's nothing to query or search, so to speak.
I suppose the other option would be to make it a POST request and supply a body that details the parts of the triple being supplied. This doesn't give me warm fuzzies though, for some reason.
How have others handled this? Delimiters seem clean to me, and communicate the intended semantics of the resource, but I know there are folks who would take a different view, and I was looking to understand the experiences of others who've had similar use cases.
Since any value can be missing and values can appear in any order, how would you know which value is for which parameter (if that matters)?
I would have used a query string for GET, or put the values in the payload for POST.
Use query parameters
/path/to/the/resource?key1=value1&key2=value2&key3=value3
or matrix parameters
/path/to/the/resource;key1=value1;key2=value2;key3=value3
Without a proper example, I'm not sure exactly about your needs.
However, a little-known fact is that any HTTP parameter can have multiple values. It is the way to go when you have a set of objects (see the Google Maps Static API for an example).
/path/to/the/resource?things=thing1&things=thing2&things=thing3
Then you can use the same API for single, pairs, triples (and more).
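On the receiving side, reading a repeated parameter could look like the sketch below. It uses the standard URL and URLSearchParams classes; the function name parseThings, the http://localhost base, and the decision to sort the values are assumptions made for illustration. Sorting is one way to treat the set as unordered, which is what the question asks for.

```javascript
// Read every value of a repeated query parameter and normalise the order,
// so ?things=thing2&things=thing1 and ?things=thing1&things=thing2 match.
function parseThings(requestUrl) {
  // The base URL is a placeholder needed only to parse a relative path.
  const url = new URL(requestUrl, "http://localhost");
  return url.searchParams.getAll("things").sort();
}

const single = parseThings("/path/to/the/resource?things=thing1");
const pair = parseThings("/path/to/the/resource?things=thing2&things=thing1");
const triple = parseThings(
  "/path/to/the/resource?things=thing1&things=thing2&things=thing3"
);
console.log(single, pair, triple);
```

Because getAll returns an empty array when the parameter is absent, the same handler also covers a request with no values at all.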
