Throttling callable HTTPS Firebase Cloud Function execution per user?

I was not able to find any resources about this, so I wanted to ask whether it is a good idea, or even necessary, to add per-user throttling to callable HTTPS Cloud Functions in Firebase.
For example, I want to limit each user to calling an HTTPS function at most once every 5 seconds.
If this is a viable thing to do, how would it be achieved?

Cloud Functions has no built-in per-user throttling. You have a few options for building your own:
Put logic in your client-side apps that tracks how often a user calls the function and blocks the call if it is too frequent.
The issue here is that this is not 100% effective against someone trying to game you, since they could use multiple windows, etc.
Implement a database solution where you track each user's usage, and check at the beginning of your function whether they are violating your rate limit.
The issue here is that your functions are still triggered, so you still incur the invocation costs.
If it is a really big issue for you, I would recommend looking at an API management platform such as Apigee, where you can apply policies such as rate limiting.
This is a heavyweight solution with increased cost, so I wouldn't do it unless necessary.
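A minimal sketch of the database-backed check from the second option, using the 5-second limit from the question. The `Map` is only a hypothetical stand-in for a real store (for example a per-user record in Firestore or RTDB); instance memory is not shared between function instances, so an in-memory map would not be enough in production. The names `tryAcceptCall` and `LastCallStore` are made up for illustration.

```typescript
// Minimum gap between two accepted calls from the same user (5 s, per the question).
const MIN_INTERVAL_MS = 5000;

// Hypothetical stand-in for a database: user id -> time of last accepted call (ms).
type LastCallStore = Map<string, number>;

// Returns true and records the call if the user may proceed;
// returns false if the previous accepted call was less than MIN_INTERVAL_MS ago.
function tryAcceptCall(store: LastCallStore, uid: string, nowMs: number): boolean {
  const last = store.get(uid);
  if (last !== undefined && nowMs - last < MIN_INTERVAL_MS) {
    return false; // too soon: reject without updating the stored timestamp
  }
  store.set(uid, nowMs);
  return true;
}
```

At the top of the function you would call something like `tryAcceptCall(store, uid, Date.now())` with the authenticated user's id and return an error when it comes back false. As noted above, the rejected call is still an invocation you pay for.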

Related

Firebase cloud functions pricing for denied requests

As far as I understand, my Google Cloud Functions are globally accessible. If I want to control access to them, I need to implement authorization as a part of the function itself. Say, I could use Bearer token based approach. This would protect the resources behind this function from unauthorized access.
However, since the function is available globally, it can still be DDoS-ed by a bad guy. If the attack is not as strong as Google's defence, my function/service may still be responsive. This is good. However, I don't want to pay for function calls made by a party I didn't authorize to access the function (since billing is per function invocation). That's why it's important for me to know whether Google Cloud Functions detects DDoS attacks and enables countermeasures before I become responsible for the charges.
I think the question about DDOS protection has been sufficiently answered. Unfortunately the reality is that, DDOS protection or no, it's easy to rack up a lot of charges. I racked up about $30 in charges in 20 minutes and DDOS protection was nowhere in sight. We're still left with "I don't want to pay for those function calls made by the party I didn't authorize to access the function."
So let's talk about realistic mitigation strategies. Google doesn't give you a way to put a hard limit on your spending, but there are various things you can do.
Limit the maximum instances a function can have
When editing your function, you can specify the maximum number of simultaneous instances that it can spawn. Set it to something your users are unlikely to hit, but that won't immediately break the bank if an attacker does. Then...
Set a budget alert
You can create budgets and set alerts in the Billing section of the cloud console. But these alerts come hours late and you might be sleeping or something so don't depend on this too much.
Obfuscate your function names
This is only relevant if your functions are only privately accessed. You can give your functions obfuscated names (maybe hashed) that attackers are unlikely to be able to guess. If your functions are not privately accessed maybe you can...
Set up a Compute Engine instance to act as a relay between users and your cloud functions
Compute instances are fixed-price. Attackers can slow them down but can't make them break your wallet. You can set up rate limiting on the compute instance. Users won't know your obfuscated cloud function names, only the relay will, so no one can attack your cloud functions directly unless they can guess your function names.
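One common way to do the rate limiting on the relay is a token bucket. This is only a sketch of that general technique, not something from the answer itself; `TokenBucket`, `takeToken`, and the capacity and refill numbers are assumptions chosen for illustration.

```typescript
// State of one bucket, e.g. one per client IP or per user on the relay.
interface TokenBucket {
  tokens: number;       // tokens currently available (may be fractional)
  lastRefillMs: number; // time of the last refill calculation
}

// Refills the bucket for the time elapsed, then tries to spend one token.
// Returns true if the request may be forwarded to the cloud function.
function takeToken(
  bucket: TokenBucket,
  nowMs: number,
  capacity: number,     // maximum burst size
  refillPerSec: number  // sustained requests per second
): boolean {
  const elapsedSec = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
  bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSec * refillPerSec);
  bucket.lastRefillMs = nowMs;
  if (bucket.tokens < 1) {
    return false; // bucket empty: drop or delay the request on the relay
  }
  bucket.tokens -= 1;
  return true;
}
```

Requests rejected here never reach the function, so they cost nothing beyond the fixed price of the relay instance.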
Have your cloud functions shut off billing if they get called too much
Every time your function gets called, you can have it increment a counter in Firebase or in a Cloud Storage object. If this counter gets too high, your functions can automatically disable billing to your project.
Google provides an example for how a cloud function can disable billing to a project: https://cloud.google.com/billing/docs/how-to/notify#cap_disable_billing_to_stop_usage
In the example, it disables billing in response to a pub/sub from billing. However the price in these pub/subs is hours behind, so this seems like a poor strategy. Having a counter somewhere would be more effective.
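A rough sketch of the counter idea, with the storage and the billing call left out. `recordInvocation`, the window length and the limit are hypothetical. In a real project the counter would live in a shared store and be incremented atomically, and `onLimitExceeded` would run the billing-disabling code from Google's example linked above.

```typescript
// Counter state for one time window.
interface InvocationWindow {
  windowStartMs: number; // when the current counting window began
  count: number;         // invocations seen in this window
}

// Counts one invocation and fires onLimitExceeded once the window's count
// goes above maxCalls. Starts a fresh window when the old one has expired.
function recordInvocation(
  current: InvocationWindow,
  nowMs: number,
  windowMs: number,
  maxCalls: number,
  onLimitExceeded: () => void
): InvocationWindow {
  const expired = nowMs - current.windowStartMs >= windowMs;
  const next: InvocationWindow = expired
    ? { windowStartMs: nowMs, count: 1 }
    : { windowStartMs: current.windowStartMs, count: current.count + 1 };
  if (next.count > maxCalls) {
    onLimitExceeded(); // e.g. disable billing for the project
  }
  return next;
}
```

Because the counter is updated on every call, it reacts immediately instead of waiting hours for the billing pub/sub message.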
I have sent an email to google-cloud support, regarding cloud functions and whether they were protected against DDoS attacks. I have received this answer from the engineering team (as of 4th of April 2018):
Cloud Functions sits behind the Google Front End which mitigates and absorbs many Layer 4 and below attacks, such as SYN floods, IP fragment floods, port exhaustion, etc.
I have been asking myself the same question recently and stumbled upon this information. In short: Google still does not automatically protect your GCF from massive DDoS attacks. Unless the Google infrastructure crashes from the attack attempts, you will have to pay for all traffic and computing time caused by the attack.
There are certain mechanisms that you should take a closer look at, though I am not sure whether each of them also applies to GCF:
https://cloud.google.com/files/GCPDDoSprotection-04122016.pdf
https://projectshield.withgoogle.com/public/
UPDATE JULY 2020: There seems to be a dedicated Google service addressing this issue, called Google Cloud Armor, as pointed out by morozko.
This is from my own, real-life, experience: THEY DON'T. You have to employ your own combo of rules, origin-detection, etc to protect against this. I've recently been a victim of DDoS and had to take the services down for a while to implement my own security wall.
From reading the docs at https://cloud.google.com/functions/quotas and https://cloud.google.com/functions/pricing, it doesn't seem that there is any abuse protection for HTTP functions. You should distinguish between a DDoS attack that makes Google's servers unresponsive and abuse where an attacker knows the URL of your HTTP function and invokes it millions of times. The latter is only a question of how much you pay.
DDoS attacks can be mitigated by Google Cloud Armor, which is in the beta stage at the moment.
See also related Google insider's short example with GC Security Rules and the corresponding reference docs
I am relatively new to this world, but from my limited experience and after some research, it is possible to benefit from Cloudflare's DDoS protection on a function's HTTP endpoint by using rewrites in your firebase.json config file.
In a typical Firebase project, here's how I do this:
Add Cloud Functions and Hosting to the project
Add a custom domain (using Cloudflare DNS) to the hosting
Add the proper rewrites to your firebase.json
"hosting": {
  // ...
  // Directs all requests from the page `/bigben` to execute the `bigben` function
  "rewrites": [ {
    "source": "/bigben",
    "function": "bigben",
    "region": "us-central1"
  } ]
}
Now, the job is on Cloudflare's side.
One possible solution could be API Gateway, where you can use Firebase Authentication. After successful authentication, the API gateway can call your function, which is deployed with the --no-allow-unauthenticated flag.
However, I'm not sure whether you are also charged for unauthenticated requests to the API gateway.

Does Firestore have an api that allows me to check daily reads/writes so I know when I'm over a certain amount?

I know that I could implement a counter in my application but using an api would still be a cleaner solution - if one exists?
Basically, Firestore has Spark free tier limits (think 50,000 reads/day) that I don't want to exceed. So whenever my app was going to do firestore reads, I would like a way to simply ask firestore whether I'm over a certain number.
I'm also reading that Google intentionally got rid of Firebase spending limits, which seems really sketchy (see "Impossible to set the Cloud Firebase daily spending limit").
There is no such API as part of Firebase. The ways to monitor usage are documented here, but none of them is an API.
You might be able to get some data through the Cloud Monitoring API. That API isn't made for client-side access, though, so you'll have to wrap it yourself.
A final alternative would be to look at a service like https://firerun.io/, which automates a lot of this.

Best strategy to develop back end of an app with large userbase, taking into account limitations of bandwidth, concurrent connections etc.?

I am developing an Android app which basically does this: on the landing (home) page it shows a couple of words. These words need to be updated on a daily basis. Secondly, there is an 'experiences' tab in which a list of user experiences (around 500) shows up with their profile picture, description, etc.
This basic app is expected to get around 1 million users daily who will open the app daily at least once to see those couple of words. Many may occasionally open up the experiences section.
Thirdly, the app needs to have a push notification feature.
I am planning to purchase managed WordPress hosting, set up a website, and add a post each day with those couple of words, then use the JSON API to extract those words and display them on the app's home page. Similarly for the experiences, I will add each as a WordPress post and extract them from the WordPress database. The reason I am choosing WordPress is that it has ready-made interfaces for data entry, which will save me time and effort.
But I am stuck on this: will the WordPress DB be able to handle such a large number of queries? With such a large user base and spiky traffic, I suspect I might exceed the maximum concurrent connections limit.
What's the best strategy in my case? Should I use WordPress, Firebase, or some other service? I also need to make sure the scheme is cost-effective.
My app is basically very similar to this one:
https://play.google.com/store/apps/details?id=com.ekaum.ekaum
For push notifications, I am planning to use third party services.
Kindly suggest the best strategy I should go with for designing the back end of this app.
Thanks to everyone out there in advance who are willing to help me in this.
I have never used Wordpress, so I don't know if or how it could handle that load.
You can still use WP for data entry, and write a scheduled function that would use WP's JSON API to copy that data into Firebase.
RTDB-vs-Firestore scalability states that RTDB can handle 200 thousand concurrent connections and Firestore 1 million concurrent connections.
However, if I get it right, your app doesn't need connections to be active (i.e. receive real-time updates). You can get your data once, then close the connection.
For RTDB, Enabling Offline Capabilities on Android states that
On Android, Firebase automatically manages connection state to reduce bandwidth and battery usage. When a client has no active listeners, no pending write or onDisconnect operations, and is not explicitly disconnected by the goOffline method, Firebase closes the connection after 60 seconds of inactivity.
So the connection should close by itself after 1 minute, if you remove your listeners, or you can force close it earlier using goOffline.
For Firestore, I don't know if it happens automatically, but you can do it manually.
In Firebase Pricing you can see that 100K Firestore document reads cost $0.06. 1M reads (for the two words) should cost $0.60 plus some network traffic. In RTDB, the cost depends on the amount of data transferred, so it requires some calculations, but it shouldn't be much. I am not familiar with the finer details of the pricing, so you should do some more research.
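The read-cost arithmetic as a small sketch. The only input is the $0.06 per 100K reads quoted above, which may be out of date, so treat the constant as an assumption. Network traffic and any free quota are ignored, and `firestoreReadCostUsd` is a made-up helper name.

```typescript
// Price quoted in the answer above; check the current pricing page before relying on it.
const PRICE_PER_100K_READS_USD = 0.06;

// Cost in USD of a given number of Firestore document reads,
// ignoring network egress and free-tier quota.
function firestoreReadCostUsd(reads: number): number {
  return (reads / 100000) * PRICE_PER_100K_READS_USD;
}

// One read per user per day for 1 million daily users: roughly $0.60 per day.
const dailyReadCostUsd = firestoreReadCostUsd(1000000);

// The same load over a 30-day month: roughly $18.
const monthlyReadCostUsd = dailyReadCostUsd * 30;
```

If the daily words are spread over more than one document, multiply the read count accordingly.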
In the app you mentioned, the experiences don't seem to change very often. You might want to try to build your own caching manually, and add the required versioning info in the daily data.
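A sketch of the versioning idea: the small daily payload carries a version number for the experiences list, and the app only downloads the list again when that number differs from what it has cached. The field names and `needsExperiencesRefresh` are hypothetical, and it is written in TypeScript only to show the logic; the Android app would do the same check in its own language.

```typescript
// Shape of the small document every user fetches once a day (assumed fields).
interface DailyPayload {
  words: string[];            // the couple of words shown on the home page
  experiencesVersion: number; // bumped whenever the experiences list changes
}

// What the app remembers locally about its cached experiences list.
interface LocalExperiencesCache {
  experiencesVersion: number | null; // null means nothing is cached yet
}

// True when the cached experiences are missing or stale and must be re-downloaded.
function needsExperiencesRefresh(
  daily: DailyPayload,
  local: LocalExperiencesCache
): boolean {
  return (
    local.experiencesVersion === null ||
    local.experiencesVersion !== daily.experiencesVersion
  );
}
```

With this, most daily opens cost only the one small read, and the roughly 500 experiences are fetched only after they actually change.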
Edit:
It would possibly be more efficient and less costly if you used Firebase Hosting, instead of RTDB/Firestore directly. See Serve dynamic content and host microservices with Cloud Functions and Manage cache behavior.
In short, you create an HTTP function that reads your database and returns the data you need. You configure hosting to call that function, and configure the cache such that subsequent requests are served the cached result via hosting (without extra function invocations).
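Caching through Hosting is controlled by the `Cache-Control` header that the function sets on its response, where `s-maxage` tells the CDN how long it may reuse the result. Below is a sketch of how that header value could be built for data that changes once a day. It assumes the daily words roll over at midnight UTC, which is my assumption and not something stated in the question; the function names are made up.

```typescript
// Whole seconds from `now` until the next midnight in UTC.
function secondsUntilNextUtcMidnight(now: Date): number {
  const nextMidnightMs = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1 // day overflow rolls into the next month automatically
  );
  return Math.ceil((nextMidnightMs - now.getTime()) / 1000);
}

// Cache-Control value: the CDN keeps the response until the next rollover,
// browsers keep it for at most 5 minutes.
function dailyCacheControl(now: Date): string {
  const cdnTtl = secondsUntilNextUtcMidnight(now);
  const browserTtl = Math.min(300, cdnTtl);
  return `public, max-age=${browserTtl}, s-maxage=${cdnTtl}`;
}
```

Inside the HTTP function this would be used along the lines of `res.set("Cache-Control", dailyCacheControl(new Date()))` before sending the response.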

Firebase Cloud Functions Response Time

Up until mid-2018 there were complaints about performance issues with Firebase Cloud Functions and Google Cloud Functions (which are the same under the hood, I believe). Like these ones:
https://github.com/googleapis/google-cloud-node/issues/2374
https://github.com/firebase/firebase-functions/issues/161
I remember seeing that a simple Hello World example had a response time of 500ms - 800ms. EDIT: I know about cold starts, but as described in the GitHub issues, cold starts were not the main problem. A Firebase Cloud Function would randomly take up to 10s to respond, which looked like a problem within Firebase.
I am currently considering building a project with Firebase and would like to build a REST API with Firebase cloud functions - but bad performance would be a deal breaker.
What's the current status? Do these problems still occur? None of these GitHub issues were properly answered by Google, but also no more users have complained ever since …
Cold start times are a fact of life for serverless backends such as Cloud Functions. It's due to the way server instances are automatically scaled up and down to handle load in a cost-effective way. You can always expect that the first request to a new server instance will take some amount of time longer than the subsequent requests that get directed to that same server instance. That amount of time will be variable depending on a number of factors, including the type of trigger, and what all needs to happen with the first request.
If you want to learn more about Cloud Functions scale, what you can expect as a result, and what you can do to mitigate cold starts, watch my video series on the matter.
Cloud Functions for Firebase are Google Cloud Functions with a wrapper that lets them integrate better with other Firebase products. Therefore, a small loss of performance is to be expected.
The important factor in deciding which one to use is what you are integrating with the most.
If your project runs in Firebase, uses Firebase Authentication, etc., then Cloud Functions for Firebase is the best choice.
On the other hand, if you are using Google Cloud Platform products, then Google Cloud Functions is the best choice.

Performance difference Firestore through Firebase Functions vs Firestore SDK

Our team is developing a mobile app and is currently in use of (Firebase) Firestore for our backend. We wrapped every DB access with Firebase Functions in order to clean up the object returned to the client app.
Does this approach introduce any (additional) unignorable overhead compared to accessing to Firestore directly?
Yes and no, depending on your use case.
If you have a small number of users with relatively low usage (in terms of the given quota), using Cloud Functions is reasonable. As stated in the documentation, Firebase Cloud Functions offers generous quotas in terms of resource limits, time limits and rate limits, with good pricing, especially on the Spark plan (free).
The advantage of using Cloud Functions is that it provides a fast, scalable processing unit, which can shorten the processing time of a given function compared with running it on the phone's CPU. Not every user owns a high-spec phone, so offloading this work to Cloud Functions can provide a better user experience (UX).
Note: I do agree with Doug that cost is one of the factors, but we should also consider performance and other perspectives.
Yes, at the very least, now your path to get data has two hops instead of one. Before, you directly accessed the database using a channel that's optimized for returning the query results. Now, you have to pay the cost of an additional hop to Cloud Functions, which makes the query. And it's possible that the results returned to the client are bigger than if you made the query directly.
Perhaps the biggest loss you'll experience is the client side caching of documents that's automatically performed by the client (enabled by default on Android and iOS). If you repeat a query and none of the documents have changed, you get immediate results from the cache instead of having to wait for the server. And you won't have to pay for document reads for cache hits. So, if you aren't also caching your results, you're also paying the monetary cost of Cloud Functions and the query to Firestore for every request.
Yes, but the answer could be different based on the situation.
If a client wants to fetch a record exactly as in the database, the Firebase SDK might be faster because there is no overhead calling the Firebase Functions.
If there is heavy processing to do after fetching a record, then Firebase Functions + Firebase Admin SDK could be faster, because the processing unit in Firebase Functions could be faster than a mobile CPU. However, if the direct request responds faster, the client app could display a message that something was fetched and is still being processed, and the user experience could be acceptable.
The only case I can come up with where Firebase Functions could always win is when the server reduces the data size enough that the overhead introduced by Firebase Functions (including processing time) is compensated by the shorter network delay. This also has the advantage of saving the client's data plan.
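To illustrate that last case, here is a sketch of the server-side trimming step: keep only the fields the client needs before sending the response. `keepFields`, `jsonSizeInChars` and the field names in the usage note are hypothetical, and the size measure is just the length of the JSON string, not the real bytes on the wire.

```typescript
// A document as plain key/value data, e.g. what a database read returns.
type PlainDoc = Record<string, unknown>;

// Returns a copy of `doc` containing only the listed fields that exist on it.
function keepFields(doc: PlainDoc, fields: readonly string[]): PlainDoc {
  const trimmed: PlainDoc = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      trimmed[field] = doc[field];
    }
  }
  return trimmed;
}

// Rough payload size: number of characters in the JSON serialization.
function jsonSizeInChars(value: PlainDoc): number {
  return JSON.stringify(value).length;
}
```

For example, `keepFields(doc, ["title", "summary"])` on a document that also carries large internal fields yields a much smaller response. Whether that saving outweighs the extra hop through the function depends on the documents and the network.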

Resources