On how many processes do Cloud Functions run? - firebase

I would like to use Google Cloud Functions to update an additional database living on Heroku because the Firebase Realtime Database is not cutting it.
However, Heroku limits the number of concurrent connections it accepts, so I'm wondering how many connections will be opened via Cloud Functions. Is it 1? Is there no limit?
I have something like this in my functions:
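import * as functions from 'firebase-functions';
import { Pool } from 'pg';
// One pool per function instance, reused across invocations.
// pg's Pool takes a config object; connectionString is assumed to be defined elsewhere.
const pool = new Pool({ connectionString });
exports.onAuth = functions.auth.user().onCreate(event => {
  // Return the promise so the function is not terminated before the query completes
  return pool.query(...)
});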

Cloud Functions will spin up as many server instances as are required to meet the demand on your functions. The count can be as low as 0 when there is no load, and much higher under heavy load. Each instance could be doing work for your function. You can't constrain the number of instances for your functions; it just scales automatically on demand. So you can think of it as having no limit (as long as you are paying your bills).
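
Since each instance loads the module and so creates its own pool, the number of Postgres connections is bounded by the instance count times the pool size, not by one. A rough sketch of that arithmetic, assuming node-postgres's default pool size of 10 (its `max` option) and a hypothetical peak instance count (the value 20 below is purely illustrative):

```javascript
// Worst case: every live instance fills its own pool.
// `instances` is a hypothetical peak instance count chosen for illustration.
function maxOpenConnections(instances, poolMax = 10) {
  if (!Number.isInteger(instances) || instances < 0) {
    throw new RangeError('instances must be a non-negative integer');
  }
  if (!Number.isInteger(poolMax) || poolMax < 1) {
    throw new RangeError('poolMax must be a positive integer');
  }
  return instances * poolMax;
}

console.log(maxOpenConnections(20));    // 200 with the default pool size of 10
console.log(maxOpenConnections(20, 1)); // 20 when each pool is capped at one client
```

Passing `max: 1` in the Pool config would keep each instance to a single connection, so the total then tracks the instance count.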

Related

Firebase Functions returns error of Bandwidth Exhausted

We are using Firebase Functions with a few different HTTP functions.
One of the functions runs via a manual trigger from our website. It then pulls in a lot of data from an external resource and saves it into our Firestore database. Our function resources are Node.js 10, 1 GB of memory and a 540s timeout.
However, when we need to pull in large datasets, e.g. 5 000 - 10 000 records to write to the database, we start running into issues. On large data sets we receive this error:
8 RESOURCE_EXHAUSTED: Bandwidth exhausted
The full error on Firebase Functions Health Dashboard logs looks like this:
Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted
    at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342:141)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at Http2CallStream.outputStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:117:74)
    at Http2CallStream.maybeOutputStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:156:22)
    at Http2CallStream.endCall (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:142:18)
    at ClientHttp2Stream.stream.on (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:420:22)
    at ClientHttp2Stream.emit (events.js:198:13)
    at ClientHttp2Stream.EventEmitter.emit (domain.js:466:23)
Our Firebase project is on the Blaze plan and is connected to an active billing account on GCP.
Upon inspection on GCP, it seems that we are NOT exceeding our writes-per-minute quota, as we previously thought; however, we are exceeding our Cloud Build limit. We also use batched writes when we save data to Firestore from within the function, which seems to reduce the number of database writes as well.
We don't use Cloud Build ourselves, so I assume that Firebase Functions uses Cloud Build in the back end to run the functions or something similar, but I can't find any documentation on the matter. We also have a few Firestore database functions that run when documents are created. I am not sure whether those use Cloud Build in the back end or not.
Any idea why this would happen? Whenever it happens, our function gets terminated with that error, which causes us to import only half of our data. The data import works flawlessly with smaller amounts of data.
See our usage here for this particular project:
Cloud Build is used during the deployment of Cloud Functions. If you check this documentation you can see that:
Deployments work by uploading an archive containing your function's source code to a Google Cloud Storage bucket. Once the source code has been uploaded, Cloud Build automatically builds your code into a container image and pushes that image to Container Registry. Cloud Functions uses that image to create the container that executes your function.
This by itself is not enough to justify the charges you are seeing, but if you check the container image documentation it says:
Because the entire build process takes place within the context of your project, the project is subject to the pricing of the included resources:
For Cloud Build pricing, see the Pricing page. This process uses the default instance size of Cloud Build, as these instances are pre-warmed and are available more quickly. Cloud Build does provide a free tier: please review the pricing document for further details.
So with that information in mind, I would make an educated guess that your website is triggering the HTTP function often enough to make Cloud Functions scale up this particular function with new instances of it, which triggers a build process for the container that hosts the function and bills you a Cloud Build charge. So to keep doing what you are doing, you will have to increase your Cloud Build quota to meet the demand from your website.
It turned out that there was a Firestore trigger firing on new records of the same type I was importing.
So in short, I was creating thousands of records in a collection, and for every one of those, the Firestore trigger (function) ran. What I did not know at the time is that each Firestore trigger that ran created a new build process in the background, which is not documented anywhere.
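
For imports of this size, one way to keep the write rate (and the number of triggered functions in flight) down is to split the records into fixed-size groups and commit them one after another. A minimal sketch, assuming a limit of 500 operations per batch (the Firestore limit documented at the time; treat the number as an assumption) and a hypothetical `commitChunk` callback supplied by the caller:

```javascript
// Split an array into consecutive groups of at most `size` items.
function chunk(items, size = 500) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Commit the groups sequentially so only one batch is in flight at a time.
// `commitChunk` is a hypothetical async callback; it is not part of any SDK.
async function importInChunks(records, commitChunk, size = 500) {
  let written = 0;
  for (const group of chunk(records, size)) {
    await commitChunk(group);
    written += group.length;
  }
  return written;
}
```

With the Admin SDK, `commitChunk` could create a batch with `db.batch()`, call `batch.set(...)` for each record in the group, and return `batch.commit()`.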

How can I have a continuous firebase cloud function for a continuous stream of data?

I need to use the Twitter Stream API to stream tweet data to my firebase cloud function like this:
client.stream('statuses/filter', params, stream => {
  stream.on('data', tweet => {
    console.log(tweet);
  });
  stream.on('error', error => {
    console.log(error);
  });
});
The stream is continuous but the firebase cloud function shuts down after a certain period of time. What solution could I make use of to be able to continuously receive the stream data?
Cloud Functions have a maximum running time of 540 seconds, as documented. You would probably have to use a Compute Engine instance from Google Cloud, where your code can run without that limit. Alternatively, you could use Google Cloud Scheduler to run your function at a fixed interval to fetch new tweets.
The accepted response suggests running GCE and while it's certainly correct, I'd like to point out that anyone who was interested in Cloud Functions - a serverless solution - might find GAE (App Engine) much more viable for streaming data.
Our application uses App Engine Standard as an ingestion service and it works like a charm, removing the overhead required by GCE. If your app requires advanced networking features, App Engine Flexible or GKE (Kubernetes Engine) might also be worth a look!

Firebase: First write is slow

I am currently developing a hybrid mobile app using Ionic. When the app starts up and a user writes to the Realtime Database for the first time, the write is always delayed by around 10 seconds or more. Any subsequent writes are almost instantaneous (less than 1 second).
My measurement of the delay is based on watching the database in the Firebase console.
Is this a known issue, or am I doing something wrong? Please share your views.
EDIT:
The write is happening via Firebase Cloud Function.
This is the call to the Firebase Cloud function
this.http.post(url + "/favouritesAndNotes", obj, this.httpOptions)
  .subscribe((data) => {
    console.log(data);
  }, (error) => {
    console.log(error);
  });
This is the actual function
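app.post('/favouritesAndNotes', async (request, response) => {
  const db = admin.database().ref("users/" + request.body.uid);
  const favourites = request.body.favourites;
  const notes = request.body.notes;
  const writes = [];
  if (favourites !== undefined) {
    writes.push(db.child("favourites/").set(favourites));
  }
  if (notes !== undefined) {
    writes.push(db.child("notes/").set(notes));
  }
  try {
    // Wait for the writes before responding; work still pending after the response may not finish
    await Promise.all(writes);
    console.log("Write successful");
    response.status(200).end();
  } catch (error) {
    console.log(error);
    response.status(500).end();
  }
});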
The first time you interact with the Firebase Database in a client instance, the client/SDK has to do quite a few things:
1. If you're using authentication, it needs to check if the token that it has is still valid, and if not refresh it.
2. It needs to find the server that the database is currently hosted on.
3. It needs to establish a web socket connection.
Each of these may take multiple round trips, so even if you're a few hundred ms from the servers, it adds up.
Subsequent operations from the same client don't have to perform these steps, so are going to be much faster.
If you want to see what's actually happening, I recommend checking the Network tab of your browser. For the realtime database specifically, I recommend checking the WS/Web Socket panel of the Network tab, where you can see the actual data frames.

Firebase: Cloud Functions, How to Cache a Firestore Document Snapshot

I have a Firebase Cloud Function that I call directly from my app. This cloud function fetches a collection of Firestore documents, iterating over each, then returns a result.
My question is: would it be best to keep the result of that fetch/get in memory (on the Node server), refreshed with .onSnapshot? It seems this would improve performance, as my cloud function would not have to wait for the Firestore response (it would already have the collection in memory). How would I do this? Is it as simple as populating a global variable? How do I set up an .onSnapshot realtime listener with Cloud Functions?
It might depend on how large these snapshots are and how many of them are cached,
because the temporary directory is a RAM disk and, without housekeeping, it might only work for a limited time.
Always delete temporary files
Local disk storage in the temporary directory is an in-memory file-system. Files that you write consume memory available to your function, and sometimes persist between invocations. Failing to explicitly delete these files may eventually lead to an out-of-memory error and a subsequent cold start.
Source: Cloud Functions - Tips & Tricks.
It does not say there what exactly the hard limit would be, and caching elsewhere might not improve access time that much. It mentions 2048 MB per function by default, while quotas can be raised under IAM & admin. It all depends on whether the quota per function can be raised far enough to handle the cache.
Here's an example for the .onSnapshot() event:
// for a single document:
var doc = db.collection('cities').doc('SF');
// this also works for multiple documents:
// var docs = db.collection('cities').where('state', '==', 'CA');
var observer = doc.onSnapshot(docSnapshot => {
  console.log(`Received doc snapshot: ${docSnapshot}`);
}, err => {
  console.log(`Encountered error: ${err}`);
});
// unsubscribe, to stop listening for changes:
var unsub = db.collection('cities').onSnapshot(() => {});
unsub();
Source: Get realtime updates with Cloud Firestore.
Cloud Firestore Triggers might be another option.
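
The global-variable idea from the question could look roughly like the sketch below. It assumes `collectionRef` is a Firestore CollectionReference or Query (anything exposing `onSnapshot`); the names `startCache` and `getCached` are hypothetical helpers, not part of any SDK.

```javascript
// Module-scope state survives between invocations that reuse the same instance.
let cachedDocs = null;  // Map of document id -> data, or null until a snapshot arrives
let unsubscribe = null; // function that stops the listener, once it is started

// Start the listener at most once per instance and return its unsubscribe function.
function startCache(collectionRef) {
  if (unsubscribe) return unsubscribe;
  unsubscribe = collectionRef.onSnapshot(
    snapshot => {
      // Rebuild the cache from the latest snapshot of the collection.
      cachedDocs = new Map(snapshot.docs.map(doc => [doc.id, doc.data()]));
    },
    err => {
      console.error(`Snapshot listener failed: ${err}`);
      cachedDocs = null; // callers should fall back to a direct get()
      unsubscribe = null;
    }
  );
  return unsubscribe;
}

// Returns the cached Map, or null if no snapshot has arrived yet.
function getCached() {
  return cachedDocs;
}
```

Such a cache lives per instance, and an instance is not guaranteed to keep processing listener updates between invocations, so callers would probably still need a fallback to a direct `get()` whenever `getCached()` returns null.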

Firebase access latency

We have an issue with Firebase access latency.
We have a Node.js application that uses the Firebase SDK.
In the code we make the following sequence of requests to Firebase:
1. Get keys by GeoFire.
2. Then get several (from 3 to 100) branches by key in parallel.
3. Then get the same number of branches from another entity by key in parallel.
In JavaScript, the parallel requests look like this:
const cardsRefs = map(keys, (key) => this.db.ref('cards').child(key));
return Promise.all(map(cardsRefs, (ref) => {
  return ref
    .once('value')
    .then((snapshot) => snapshot.val());
}));
That's all; it is not that much, I think.
But it can take a lot of time (up to 5000ms), and the time is not consistent: sometimes it is 700ms and sometimes much more (up to 5000ms). We expected more predictable behaviour from Firebase here.
First, we thought that the unstable latency was caused by a poorly chosen Google Cloud location (the code was running on Google Compute Engine). We tried us-west1 and us-central1. Please point us to the right datacenter location.
But then we rewrote the code as a Google Cloud Function and got the same result.
Please tell us how we can get more predictable and stable latency on Firebase requests.
We migrated our functions together with the backend to Cloud Functions and the situation has improved. Each function now runs roughly 1.5 - 2 seconds faster. At the moment, we are satisfied with this situation.
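
To see which of the parallel reads dominates the total time, each one can be wrapped in a small timer. This is only a diagnostic sketch: `timed` and `readAll` are hypothetical helpers, and `readCard` stands in for a call such as `ref.once('value').then((snapshot) => snapshot.val())`.

```javascript
// Run a promise-returning function and log how long it took, even if it fails.
async function timed(label, fn) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${label}: ${ms.toFixed(1)} ms`);
  }
}

// Issue all reads in parallel, timing each one individually.
// `readCard` is a hypothetical async function taking a key and resolving to its value.
function readAll(keys, readCard) {
  return Promise.all(
    keys.map((key) => timed(`cards/${key}`, () => readCard(key)))
  );
}
```

If one key is consistently slow, the per-key log lines make that visible; if all of them are slow together, the delay more likely sits in connection setup than in any single read.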
