Firebase storage (files) backup - firebase

We have a scheduled function that processes a Firebase backup.
For the Firestore DB, we use the built-in export method - https://firebase.google.com/docs/firestore/solutions/schedule-export
However, we also need to back up the user files that are placed in Storage. I did not find a good solution for backing up files, so I ended up using the bucket.copy method - it copies files one by one and puts them into another bucket (a sketch of this approach follows the question).
That worked fine for a while, but now the function hits a timeout error (as we already have a lot of files) - Function execution took 540026 ms, finished with status: 'timeout'.
As the number of files grows every day, I am not sure how to fix the timeout issue. Could you please advise?
Would you recommend another approach? If so, please specify.
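For reference, a minimal sketch of that copy-one-by-one approach (the backup bucket name is a placeholder and error handling is omitted):
const admin = require('firebase-admin');
admin.initializeApp();

async function backupStorageFiles() {
  const sourceBucket = admin.storage().bucket(); // default bucket of the project
  const backupBucket = admin.storage().bucket('my-backup-bucket'); // placeholder name

  const [files] = await sourceBucket.getFiles();
  for (const file of files) {
    // Copy each file into the backup bucket under the same path.
    await file.copy(backupBucket.file(file.name));
  }
}
With a large number of files this loop runs longer than the function timeout, which is the problem described above.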

Increase the memory and timeout by passing runtime options to your function.
Try the code below:
const runtimeOpts = {
  timeoutSeconds: 300, // maximum for 1st-gen functions is 540 seconds
  memory: '1GB'
};

exports.getMessages = functions.runWith(runtimeOpts).https.onCall((data, context) => {
  // function body
});

Related

Firebase One-time Functions

I'm sure these are common scenarios, but after several hours of research I couldn't really find what the common practice is. Maybe someone with more experience in Firebase can point me in the right direction.
I have two particular scenarios:
1. Code that runs once
Example 1: adding new data to all users in firestore, which is needed for a new feature
Example 2: start duplicating data into existing documents
I currently write the code in a cloud function, and run it on a firestore event (onUpdate of a "hidden" document) and then I immediately delete the function if everything goes well.
I also increase the timeout and memory for this function, as the idea is to potentially update millions of documents.
2. Manually trigger a function from the firebase console (or command line)
Example: Give a user admin privileges in the app (function that sets custom claims and firestore data). We don't have time to implement a back-office, so doing this from the firebase web portal/console would be ideal, specifying the user id.
My current solution is to use an HTTPS function and run it from the GCP portal (on the function's "Testing" tab, which lets me pass a JSON payload). BUT the function can be triggered publicly, which I don't really like...
What are the common practices for these scenarios?
To expand on my comment: if you want to create a Node script to run one-off code, you write your JS code just like you would for any cloud function, but run it immediately. Something like this:
const admin = require('firebase-admin');

admin.initializeApp();
const db = admin.firestore();

db.collection('users')
  .where('myField', '==', true)
  .get()
  .then(querySnapshot => {
    querySnapshot.docs.forEach(doc => {
      // update doc
    });
  });
If you save this as script.js and execute it with node script.js, the error message will point you towards downloading a JSON credentials file. Follow the instructions, run the script again, and now you're running your own code against Firestore from the command line.
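If you'd rather not rely on the GOOGLE_APPLICATION_CREDENTIALS environment variable, a minimal sketch of initializing the Admin SDK with an explicit service-account key looks like this (the key file path is a placeholder):
const admin = require('firebase-admin');

// Path to the downloaded service-account key file (placeholder).
const serviceAccount = require('./serviceAccountKey.json');

admin.initializeApp({
  credential: admin.credential.cert(serviceAccount),
});

const db = admin.firestore();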
For administrative type operations, you're better off just running them on your desktop or some other server you control. Cloud Functions is not well suited for long running operations, or things that must just happen once on demand.
Case 1 really should be managed by a standalone program or script that you can monitor by running it on your desktop.
Case 2 can be done a number of ways, such as building your own admin web site. But you might find it easiest to mirror the contents of a document to custom claims using a Firestore trigger. Read this: https://medium.com/firebase-developers/patterns-for-security-with-firebase-supercharged-custom-claims-with-firestore-and-cloud-functions-bb8f46b24e11
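The pattern in that article is roughly: write the desired claims into a Firestore document and let a trigger copy them onto the user. A minimal, unverified sketch (the userClaims collection name is a placeholder):
const functions = require('firebase-functions');
const admin = require('firebase-admin');

admin.initializeApp();

// Hypothetical collection: each document holds the claims for one user,
// keyed by uid, e.g. userClaims/{uid} = { admin: true }.
exports.mirrorCustomClaims = functions.firestore
  .document('userClaims/{uid}')
  .onWrite(async (change, context) => {
    const claims = change.after.exists ? change.after.data() : {};
    // Copy the document contents into the user's custom claims.
    await admin.auth().setCustomUserClaims(context.params.uid, claims);
  });
You then only need write access to that collection (e.g. from the Firebase console) to grant or revoke admin privileges, without exposing a public HTTPS endpoint.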

Firebase: Cloud Functions, How to Cache a Firestore Document Snapshot

I have a Firebase Cloud Function that I call directly from my app. This cloud function fetches a collection of Firestore documents, iterates over each one, then returns a result.
My question is: would it be best to keep the result of that fetch/get in memory (on the Node server), refreshed with .onSnapshot? It seems this would improve performance, as my cloud function would not have to wait for the Firestore response (it would already have the collection in memory). How would I do this? Is it as simple as populating a global variable? How do I set up an .onSnapshot realtime listener with cloud functions?
It might depend on how large these snapshots are and how many of them would be cached, because it is a RAM disk, and without housekeeping it might only work for a limited time.
Always delete temporary files
Local disk storage in the temporary directory is an in-memory file-system. Files that you write consume memory available to your function, and sometimes persist between invocations. Failing to explicitly delete these files may eventually lead to an out-of-memory error and a subsequent cold start.
Source: Cloud Functions - Tips & Tricks.
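As an illustration of that tip, a minimal sketch of cleaning up a temporary file after use (the file name is a placeholder):
const os = require('os');
const path = require('path');
const fs = require('fs');

// Hypothetical temp file written during a function invocation.
const tempFilePath = path.join(os.tmpdir(), 'thumbnail.png');

// ... write to and use tempFilePath ...

// Delete the file so it doesn't keep consuming the function's memory
// across invocations on a warm instance.
fs.unlinkSync(tempFilePath);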
The documentation does not say what exactly the hard limit would be - and caching elsewhere might not improve access time that much. It states 2048 MB per function by default, while the quotas can be raised under IAM & Admin. It all depends on whether the per-function quota can be raised far enough to hold the cache.
Here's an example for the .onSnapshot() listener:
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

// For a single document:
const doc = db.collection('cities').doc('SF');
// This also works for queries over multiple documents:
// const docs = db.collection('cities').where('state', '==', 'CA');

const observer = doc.onSnapshot(docSnapshot => {
  console.log(`Received doc snapshot: ${docSnapshot}`);
}, err => {
  console.log(`Encountered error: ${err}`);
});

// onSnapshot returns an unsubscribe function; call it to stop listening for changes:
const unsub = db.collection('cities').onSnapshot(() => {});
unsub();
Source: Get realtime updates with Cloud Firestore.
Cloud Firestore Triggers might be another option.
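To address the "global variable" part of the question: a minimal, unverified sketch of caching a collection at module scope inside a callable function (collection and function names are placeholders):
const functions = require('firebase-functions');
const admin = require('firebase-admin');

admin.initializeApp();
const db = admin.firestore();

// Module-scope cache: survives between invocations on the same warm instance,
// but is lost on cold starts and is not shared across instances.
let cachedCities = null;

function startListener() {
  db.collection('cities').onSnapshot(snapshot => {
    cachedCities = snapshot.docs.map(doc => ({ id: doc.id, ...doc.data() }));
  });
}

exports.getCities = functions.https.onCall(async (data, context) => {
  if (cachedCities === null) {
    // First call on this instance: fetch once and attach the listener.
    const snapshot = await db.collection('cities').get();
    cachedCities = snapshot.docs.map(doc => ({ id: doc.id, ...doc.data() }));
    startListener();
  }
  return { cities: cachedCities };
});
Keep in mind that background activity on an idle instance isn't guaranteed, so the listener may not stay current between invocations; the Firestore triggers mentioned above are the more reliable option.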

Firebase functions slow cold start time

I read here that endpoint spin-up is supposed to be transparent, which I assume means cold start times should not differ from regular execution times. Is this still the case? We are getting extremely slow, unusable cold start times - around 16 seconds - across all endpoints.
Cold start:
Function execution took 16172 ms, finished with status code: 200
After: Function execution took 1002 ms, finished with status code: 304
Is this expected behaviour and what could be causing it?
UPDATE: The cold start times seem to no longer be an issue with node 8, at least for me. I'll leave my answer below for any individuals curious about keeping their functions warm with a cron task via App Engine. However, there is also a new cron method available that may keep them warm more easily. See the firebase blog for more details about cron and Firebase.
My cold start times have been ridiculous, to the point where the browser will time out waiting for a request (e.g. when it's waiting for a Firestore API call to complete).
Example
A function that creates a new user account (auth.user().onCreate trigger), then sets up a user profile in firestore.
First Start After Deploy: consistently between 30 and 60 seconds; it frequently gives me a "connection error" on the first try when cold (this is after waiting several seconds once the Firebase CLI says "Deploy Complete!").
Cold Start: 10 - 20 seconds
When Warm: All of this completes in approximately 400ms.
As you can imagine, not many users will sit around waiting more than a few seconds for an account to be setup. I can't just let this happen in the background either, because it's part of an application process that needs a profile setup to store input data.
My solution was to add a "ping" function to all of my APIs and create a cron-like scheduler task in App Engine to ping each of my functions every minute.
Ensure the ping function does something real, like accessing a Firestore document or setting up a new user account, and doesn't just respond to the HTTP request.
See this tutorial for app engine scheduling: https://cloud.google.com/appengine/docs/flexible/nodejs/scheduling-jobs-with-cron-yaml
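With the newer scheduled-functions support mentioned in the update above, a keep-warm ping can be expressed without App Engine. A minimal sketch (the schedule and the warm-up work are placeholders):
const functions = require('firebase-functions');
const admin = require('firebase-admin');

admin.initializeApp();

// Runs on a Cloud Scheduler cron under the hood.
exports.keepWarm = functions.pubsub.schedule('every 5 minutes').onRun(async () => {
  // Do some real work so the relevant code paths stay warm,
  // e.g. read a small Firestore document.
  await admin.firestore().doc('warmup/ping').get();
  return null;
});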
Well, I guess it is about the resource usage of Cloud Functions; I was there too. While your functions are idle, Cloud Functions releases their resources; on the first call it reassigns those resources, and by the second call you are fine. I cannot say whether that is good or not, but that is the case.
If you try to return a value from an async function, there won't be anything held in the main function definition (functions.https.onCall), and GCP will think that the function has finished and try to remove its resources.
Previous breaking code (taking 16+ seconds):
functions.https.onCall((data, context) => {
  return asyncFunction()
});
After returning a promise in the function definition, the function times are a lot faster (milliseconds), as the function waits for the promise to resolve before removing resources.
functions.https.onCall((data, context) => {
  return new Promise((resolve) => {
    asyncFunction()
      .then((message) => {
        resolve(message);
      }).catch((error) => {
        resolve(error);
      });
  });
});

Google Cloud Functions with ECONNRESET errors until I redeploy

I'm using Google Cloud Functions to:
1. Watch for a new Firebase entry
2. Download a file that's referenced in the Firebase entry
3. Generate a thumbnail based on that file
4. Upload the thumbnail to the cloud bucket
Unfortunately I'm getting ECONNRESET errors repeatedly on step 4, and the only way to fix it seems to be to redeploy the function. Any ideas how to further debug this?
Edit:
It seems that, many times when this happens, deploying the function again errors out and I have to run the deploy twice. Is something hanging?
Update May 9 2017
According to this thread, the google cloud nodejs API developers have made some changes to the defaults that are used when initializing that should solve these ECONNRESET socket issues.
From @stephenplusplus on GitHub, GoogleCloudPlatform/google-cloud-node issue 2254:
We have disabled the forever agent by default in Cloud Function environments. If you un- and re-install @google-cloud/storage, you will pick up the new behavior automatically. Thanks for all of the helpful debugging, everyone!
Older Post Follows:
My solution to similar ECONNRESET issues when using Storage on the Cloud Functions platform was to use npm:promise-retry, but set up your own retry strategy, because the default of 10 retries is too many.
I reported an ECONNRESET issue with cloud functions to Google Support (which you might star if you are also getting ECONNRESET in this context but not in other contexts) and they replied with a "won't fix" that the behavior is expected. Google support said the socket that the API client library uses to connect times out after a few minutes, and then when your cloud function tries to use it again you get ECONNRESET. They recommended adding autoRetry:true when initializing the storage API, but that did not help.
The ECONNRESETs happen on the read side too. In both read and write cases promise-retry helps, and most of the time only 1 retry is needed to reset the bad socket.
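A minimal sketch of wrapping a Storage call in promise-retry with a small retry budget (bucket and file names are placeholders):
const promiseRetry = require('promise-retry');
const storage = require('@google-cloud/storage')();

const bucket = storage.bucket('my-bucket'); // placeholder name

// Retry the download a couple of times if a stale socket throws ECONNRESET.
function downloadWithRetry(fileName) {
  return promiseRetry((retry, attempt) => {
    return bucket.file(fileName).download().catch(err => {
      console.log(`attempt ${attempt} failed: ${err.code || err.message}`);
      retry(err);
    });
  }, { retries: 2 });
}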
So I wrote npm:pipe-to-storage to return a promise to do the retries, check md5, etc., but I haven't tested it with binary data, only text, so I don't know if you could use it with image files. The calls would look like this:
const fs = require('fs');
const storage = require('@google-cloud/storage')();
const pipeToStorage = require('pipe-to-storage')(storage);

const source = () => fs.createReadStream('/path/to/your/file/to/upload');

pipeToStorage(source, bucketName, fileNameInBucket).then(/* do next step */);
See also How do I read the contents of a new cloud storage file of type .json from within a cloud function?
You can directly report a bug to the Firebase Support team, or open a support ticket with Firebase to troubleshoot a specific issue.
You may also report a Cloud Functions specific issue in the Google Issue Tracker, which is similar to Stack Overflow in that it is accessible by the public (but specifically used for filing issue reports).

Monitor meteorjs active reactive connections

We have a problem with our Meteor server. When we publish 300 or so items with Meteor.publish/Meteor.subscribe, the server's memory usage increases and it eventually becomes unresponsive.
We thought of:
1) monitoring the number of reactive subscriptions / the memory taken by an active subscription
2) making something like a "one-time publish" - ignore changes in the server-side collection
Any thoughts on how either of the above can be accomplished?
Or any other tips to debug/improve Meteor app performance?
Thanks
zorlak's answer is good.
Some other things:
You can do a one-time publish by writing your own custom publisher via the low-level publish API (this.added / this.ready), based on the code in _publishCursor. You'd do something like:
Meteor.publish("oneTimeQuery", function () {
MyCollection.find().forEach(function (doc) {
sub.added("collectionName", doc._id, doc);
});
sub.ready();
});
This does a query, sends its results down, and then never updates it again.
That said, we hope that Meteor's performance will be such that this is unnecessary!
I'd also like to add an easy way to get stats (like number of observed cursors) from an app to Meteor (exposed as an authenticated subscription) but haven't had the time yet.
As of Meteor 0.5.1, one thing you can do is to remove dependencies on the userId from the publish function. If a publish function does not depend on which user is subscribing, then Meteor will cache the db query so it doesn't get slower as more users subscribe.
See this Meteor blog post: http://meteor.com/blog/2012/11/20/meteor-051-database-scaling
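For illustration, a minimal sketch of a publish function that doesn't depend on the subscribing user (collection name and query are placeholders), so the server can share the underlying observed query across subscribers:
// No reference to this.userId, so the same query can be reused for everyone.
Meteor.publish("publicPosts", function () {
  return Posts.find({ isPublic: true }, { limit: 300 });
});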
