Cloud Functions: zip multiple documents from Cloud Storage - firebase

I already searched through a lot of questions on Stack Overflow but couldn't find a fitting answer from which I could derive what I need:
I want to zip multiple files from a folder within Google Cloud Storage/Firebase Storage with a Cloud Function.
I already found the solution for zipping documents from the local filesystem but could not derive how to do it within a Cloud Function for Cloud Storage.

Google Cloud Storage supports the decompressive form of transcoding, but not a compressive form. However, any user can store a gzip-compressed file in Cloud Storage.
To zip multiple documents from Cloud Storage using Cloud Functions, you can download the files from Cloud Storage to the function instance using gcs.bucket.file(filePath).download, zip the files, and re-upload the archive to Cloud Storage. Here you will find an example of downloading, transforming, and uploading a file. You can find an example of zipping multiple files in this Stack Overflow thread. This document explains how you can upload objects to Cloud Storage using the Console, gsutil, a code sample, or the REST APIs.

A bit late, but I had the same problem to solve.
The following Firebase Function:
Runs with 1 GB / 120 seconds timeout (for good measure)
Is triggered by every finalize (upload) event on the bucket (do this only if you have few calls!)
Ignores all paths except background_thumbnail/
Creates a random working directory and deletes it afterwards
Downloads images from Firebase Storage
Zips these images in a folder: background_thumbnail/<IMAGE>
Uploads created ZIP to Firebase Storage
Creates a signed URL for the ZIP file at Firebase Storage
Stores the signed URL in Firestore.
The code can probably be improved and made more elegant, but it works (for now).
const functions = require("firebase-functions");
const admin = require("firebase-admin");
const path = require("path");
const os = require("os");
const fs = require("fs");
const mkdirp = require("mkdirp");
const {v4: uuidv4} = require("uuid"); // for random working dir
const JSZip = require("jszip");

admin.initializeApp();
const db = admin.firestore(); // used below to store the signed URL
exports.generateThumbnailZip = functions
.runWith({memory: "1GB", timeoutSeconds: 120})
.region("europe-west3")
.storage.object()
.onFinalize(async (object) => {
// background_thumbnail/ is the watched folder
if (!object.name.startsWith("background_thumbnail/")) {
return functions.logger.log(`Aborting, got: ${object.name}.`);
}
const jszip = new JSZip();
const bucket = admin.storage().bucket();
const fileDir = path.dirname(object.name);
const workingDir = path.join(os.tmpdir(), uuidv4());
const localZipPath = path.join(workingDir, `${fileDir}.zip`);
const remoteZipPath = `${fileDir}.zip`;
await mkdirp(workingDir);
// -------------------------------------------------------------------
// DOWNLOAD and ZIP
// -------------------------------------------------------------------
const [files] = await bucket.getFiles({prefix: `${fileDir}/`});
for (let index = 0; index < files.length; index++) {
const file = files[index];
const name = path.basename(file.name);
const tempFileName = path.join(workingDir, name);
functions.logger.log("Downloading tmp file", tempFileName);
await file.download({destination: tempFileName});
jszip.folder(fileDir).file(name, fs.readFileSync(tempFileName));
}
const content = await jszip.generateAsync({
type: "nodebuffer",
compression: "DEFLATE",
compressionOptions: { level: 9 }
});
functions.logger.log("Saving zip file", localZipPath);
fs.writeFileSync(localZipPath, content);
// -------------------------------------------------------------------
// UPLOAD ZIP
// -------------------------------------------------------------------
functions.logger.log("Uploading zip to storage at", remoteZipPath);
const uploadResponse = await bucket
.upload(path.resolve(localZipPath), {destination: remoteZipPath});
// -------------------------------------------------------------------
// GET SIGNED URL FOR ZIP AND STORE IT IN DB
// -------------------------------------------------------------------
functions.logger.log("Getting signed URLs.");
const signedResult = await uploadResponse[0].getSignedUrl({
action: "read",
expires: "03-01-2500",
});
const signedUrl = signedResult[0];
functions.logger.log("Storing signed URL in db", signedUrl);
// Stores the signed URL under "zips/<WATCHED DIR>.signedUrl"
await db.collection("zips").doc(fileDir).set({
signedUrl: signedUrl,
}, {merge: true});
// -------------------------------------------------------------------
// CLEAN UP
// -------------------------------------------------------------------
functions.logger.log("Unlinking working dir", workingDir);
fs.rmSync(workingDir, {recursive: true, force: true});
functions.logger.log("DONE");
return null;
});


Firebase Cloud Function - Container worker exceeded memory limit of 256 MiB with 258 MiB used after servicing 29 requests - GenerateThumbnail

I am developing an Android app with Firebase as a backend. It's still in the prototyping phase: single user, no heavy traffic at all. I have deployed (so far) 10 Cloud Functions, with no tweaking of memory (256 MB) or other settings. One of them is
generateThumbnail from the samples (slightly modified). As I test my app, new images are uploaded to the bucket, and thumbnails are created in the same folder. Basically, the function worked as expected. However, yesterday, I got this last log statement before the error:
Container worker exceeded memory limit of 256 MiB with 258 MiB used after servicing 29 requests total. Consider setting a larger instance class and
and then actual error:
Function invocation was interrupted. Error: function terminated. Recommended action: inspect logs for termination reason. Additional troubleshooting documentation can be found at https://cloud.google.com/functions/docs/troubleshooting#logging
Again, I am currently single user, and function was triggered probably around 50 times so far. Obviously something is not working as expected.
this is the function:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
const path = require('path');
const os = require('os');
const fs = require('fs');
const mkdirp = require('mkdirp');
const spawn = require('child-process-promise').spawn;
admin.initializeApp();
// Constant values as in the generateThumbnail sample this is based on.
const THUMB_MAX_HEIGHT = 200;
const THUMB_MAX_WIDTH = 200;
const THUMB_PREFIX = 'thumb_';
exports.generateThumbnail = functions.storage.object().onFinalize(async (object) => {
// File and directory paths.
const filePath = object.name;
const contentType = object.contentType; // This is the image MIME type
const fileDir = path.dirname(filePath);
const fileName = path.basename(filePath);
const thumbFilePath = path.normalize(path.join(fileDir, `${THUMB_PREFIX}${fileName}`));
const tempLocalFile = path.join(os.tmpdir(), filePath);
const tempLocalDir = path.dirname(tempLocalFile);
const tempLocalThumbFile = path.join(os.tmpdir(), thumbFilePath);
//foldername in docId from pozes-test collection
const folderName = path.basename(fileDir)
const docIdFromFolderName = path.basename(fileDir)
// Exit if this is triggered on a file that is not an image.
if (!contentType.startsWith('image/')) {
return functions.logger.log('This is not an image.');
}
// Exit if the image is already a thumbnail.
if (fileName.startsWith(THUMB_PREFIX)) {
return functions.logger.log('Already a Thumbnail.');
}
// Cloud Storage files.
const bucket = admin.storage().bucket(object.bucket);
const file = bucket.file(filePath);
const thumbFile = bucket.file(thumbFilePath);
const metadata = {
contentType: contentType,
// To enable Client-side caching you can set the Cache-Control headers here. Uncomment below.
'Cache-Control': 'public,max-age=3600',
};
// Create the temp directory where the storage file will be downloaded.
await mkdirp(tempLocalDir)
// Download file from bucket.
await file.download({destination: tempLocalFile});
functions.logger.log('The file has been downloaded to', tempLocalFile);
// Generate a thumbnail using ImageMagick.
await spawn('convert', [tempLocalFile, '-thumbnail', `${THUMB_MAX_WIDTH}x${THUMB_MAX_HEIGHT}>`, tempLocalThumbFile], {capture: ['stdout', 'stderr']});
functions.logger.log('Thumbnail created at', tempLocalThumbFile);
// Uploading the Thumbnail.
await bucket.upload(tempLocalThumbFile, {destination: thumbFilePath, metadata: metadata});
functions.logger.log('Thumbnail uploaded to Storage at', thumbFilePath);
// Once the image has been uploaded delete the local files to free up disk space.
fs.unlinkSync(tempLocalFile);
fs.unlinkSync(tempLocalThumbFile);
// Get the Signed URLs for the thumbnail and original image.
const results = await Promise.all([
thumbFile.getSignedUrl({
action: 'read',
expires: '03-01-2500',
}),
file.getSignedUrl({
action: 'read',
expires: '03-01-2500',
}),
]);
functions.logger.log('Got Signed URLs.');
const thumbResult = results[0];
const originalResult = results[1];
const thumbFileUrl = thumbResult[0];
const fileUrl = originalResult[0];
// Add the URLs to the Database
if (fileName == "image_0") {
await admin.firestore().collection('testCollection').doc(docIdFromFolderName).update({thumbnail: thumbFileUrl});
return functions.logger.log('Thumbnail URLs saved to database.');
} else {
return ("fileName: " + fileName + " , nothing written to firestore")
}
});
This is from my package.json:
"dependencies": {
"firebase-admin": "^10.0.2",
"firebase-functions": "^3.22.0",
"googleapis": "^105.0.0",
"child-process-promise": "^2.2.1",
"mkdirp": "^1.0.3"
Can someone please explain what could be the reason this is happening? Why is this function exceeding its 256 MB of memory with so little traffic? Is this working memory? Could it be that files are not getting deleted from the tmp folder?
I have recreated the setup from your given code and data, but I am not getting any kind of error like you are experiencing. I have also tried with an image larger than 20 MB, but my memory consumption for the function still hovers around 60 MB per call.
More specifically, I have also tried to hammer the function with more than 40 invocations, but still no error pops up for me.
I think it is best to create a GitHub issue under the same link you provided (generateThumbnail from samples), so the actual engineers behind the product can support you, or you can also try to contact Firebase support.

Generating a PDF when a document is created in Firebase Cloud Firestore

I'm developing an app that creates a PDF based on a web form.
I am currently attempting to use pdfmake to generate the PDFs based on a firestore document create trigger
import * as functions from 'firebase-functions';
const admin = require('firebase-admin');
admin.initializeApp();
const PdfPrinter = require('pdfmake');
const fs = require('fs');
export const createPDF = functions.firestore
.document('pdfs/{pdf}')
.onCreate(async (snap, context) => {
var pdfName = context.params.pdf;
var printer = new PdfPrinter();
var docDefinition = {
// Pdf Definitions
};
var options = {
// Pdf Options
};
var pdfDoc = printer.createPdfKitDocument(docDefinition, options);
pdfDoc.pipe(fs.createWriteStream('tempDoc.pdf'));
await pdfDoc.end();
// Upload to Firebase Storage
const bucket = admin.storage().bucket('myproject.appspot.com');
bucket.upload('tempDoc.pdf', {
destination: pdfName + '.pdf',
});
return fs.unlinkSync('document.pdf');
});
The trigger is called, however i get the error "Error: ENOENT: no such file or directory, stat 'document.pdf'"
I have tried it with the onCreate function being async and without.
Any help is greatly appreciated
It's not possible to write to any file location in Cloud Functions outside of /tmp. If your code needs to write a file, it should build paths off of os.tmpdir() as described in the documentation:
The only writeable part of the filesystem is the /tmp directory, which
you can use to store temporary files in a function instance. This is a
local disk mount point known as a "tmpfs" volume in which data written
to the volume is stored in memory. Note that it will consume memory
resources provisioned for the function.
The rest of the file system is read-only and accessible to the
function.

how to create refFromURL with admin privilege on cloud functions?

I want to have a reference to an image using its http URL when firestore update cloud function triggered so that i can take the url from change provide by onUpdate() function and use it to get a reference to the image on firebase storage and delete it.
In order to delete a file stored in Cloud Storage for Firebase from a Cloud Function you will need to create a File object based on:
The Bucket instance this file is attached to;
The name of the file,
and then call the delete() method
as detailed in the Node.js library documentation https://cloud.google.com/nodejs/docs/reference/storage/2.0.x/File.
Here is an example of code from the documentation:
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
const bucketName = 'Name of a bucket, e.g. my-bucket';
const filename = 'File to delete, e.g. file.txt';
// Deletes the file from the bucket
storage
.bucket(bucketName)
.file(filename)
.delete()
.then(() => {
console.log(`gs://${bucketName}/${filename} deleted.`);
})
.catch(err => {
console.error('ERROR:', err);
});
From your question, I understand that your app clients don't have the bucket and file names as such and only have a download URL (probably generated through getDownloadURL if it is a web app, or the similar method for other SDKs).
So the challenge is to derive the bucket and file names from a download URL.
If you look at the format of a download URL you will find that it is composed as follows:
https://firebasestorage.googleapis.com/v0/b/<your-bucket-name>/o/<url-encoded-file-path>?alt=media&token=<a-token-string>
where the bucket name is typically <your-project-id>.appspot.com and any slashes in the file path are encoded as %2F.
So you just need to use a set of Javascript methods like indexOf(), substring() and/or slice() to extract the bucket and file names from the download URL.
Based on the above, your Cloud Function code could then look like:
const storage = new Storage();
.....
exports.deleteStorageFile = functions.firestore
.document('deletionRequests/{requestId}')
.onUpdate((change, context) => {
const newValue = change.after.data();
const downloadUrl = newValue.downloadUrl;
// extract the bucket and file names, for example through two dedicated Javascript functions
const fileBucket = getFileBucket(downloadUrl);
const fileName = getFileName(downloadUrl);
return storage
.bucket(fileBucket)
.file(fileName)
.delete()
});
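The answer leaves getFileBucket and getFileName as an exercise; one possible implementation (a sketch, assuming the download URL format shown above, with a made-up bucket and token) is:

```javascript
// Extracts the bucket name from a Firebase Storage download URL,
// i.e. the segment between "/b/" and "/o/".
function getFileBucket(downloadUrl) {
  const afterB = downloadUrl.split("/b/")[1];
  return afterB.split("/o/")[0];
}

// Extracts the full (decoded) file path: the segment after "/o/",
// up to the query string, with %2F turned back into "/".
function getFileName(downloadUrl) {
  const afterO = downloadUrl.split("/o/")[1];
  return decodeURIComponent(afterO.split("?")[0]);
}

const url = "https://firebasestorage.googleapis.com/v0/b/demo.appspot.com/o/images%2Fcat.png?alt=media&token=abc123";
console.log(getFileBucket(url)); // "demo.appspot.com"
console.log(getFileName(url));   // "images/cat.png"
```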

How to set the destination for file.download() in Google Cloud Storage?

The Google Cloud Storage documentation for download() suggests that a destination folder can be specified:
file.download({
destination: '/Users/me/Desktop/file-backup.txt'
}, function(err) {});
No matter what value I put in, my file is always downloaded to Firebase Cloud Storage at the root level. This question says that the path can't have an initial slash, but changing the example to
file.download({
destination: 'Users/me/Desktop/file-backup.txt'
}, function(err) {});
doesn't make a difference.
Changing the destination to
file.download({
destination: ".child('Test_Folder')",
})
resulted in an error message:
EROFS: read-only file system, open '.child('Test_Folder')'
What is the correct syntax for a Cloud Storage destination (folder and filename)?
Changing the bucket from myapp.appspot.com to myapp.appspot.com/Test_Folder resulted in an error message:
Cannot parse JSON response
Also, the example path appears to specify a location on a personal computer's hard drive. It seems odd to set up a Cloud Storage folder for Desktop. Does this imply that there's a way to specify a destination somewhere other than Cloud Storage?
Here's my code:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
exports.Storage = functions.firestore.document('Storage_Value').onUpdate((change, context) => {
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
const bucket = storage.bucket('myapp.appspot.com');
bucket.upload('./hello_world.ogg')
.then(function(data) {
const file = data[0];
file.download({
destination: 'Test_Folder/hello_dog.ogg',
})
.then(function(data) {
const contents = data[0];
console.log("File uploaded.");
})
.catch(error => {
console.error(error);
});
})
.catch(error => {
console.error(error);
});
return 0;
});
According to the documentation:
The only writeable part of the filesystem is the /tmp directory, which
you can use to store temporary files in a function instance. This is a
local disk mount point known as a "tmpfs" volume in which data written
to the volume is stored in memory. Note that it will consume memory
resources provisioned for the function.
The rest of the file system is read-only and accessible to the
function.
You should use os.tmpdir() to get the best writable directory for the current runtime.
Thanks Doug, the code is working now:
exports.Storage = functions.firestore.document('Storage_Value').onUpdate((change, context) => {
const {Storage} = require('@google-cloud/storage');
const storage = new Storage();
const bucket = storage.bucket('myapp.appspot.com');
const options = {
destination: 'Test_Folder/hello_world.dog'
};
bucket.upload('hello_world.ogg', options)
.then(function(data) {
const file = data[0];
});
return 0;
});
The function gets the file hello_world.ogg from the functions folder of my project, then writes it to Test_Folder in my Firebase Cloud Storage, and changes the name of the file to hello_world.dog. I copied the download URL and audio file plays perfectly.
Yesterday I thought it seemed odd that writing a file to Cloud Storage was called download(), when upload() made more sense. :-)
You can download files from Google Cloud Storage to your computer using the following code or command.
Install Python on your PC.
Install the Cloud Storage client library on your PC:
pip install google-cloud-storage
For your project (here kesaktopdi, whose bucket is kesaktopdi.appspot.com), create a service account key at
https://console.cloud.google.com/apis/credentials/serviceaccountkey?project=kesaktopdi
then download the .json file and save it in the /home/login/ folder. Change these values to match your own account.
import os
from google.cloud import storage
ACCOUNT_ID='kesaktopdi'
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/home/login/" + ACCOUNT_ID + ".json"
def download_blob(bucket_name, source_blob_name, destination_file_name):
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(source_blob_name)
blob.download_to_filename(destination_file_name)
#print('Blob {} downloaded to {}.'.format(source_blob_name,destination_file_name))
download_blob(ACCOUNT_ID +'.appspot.com', #account link
'user.txt', #file location on the server
'/home/login/kesaktopdi.txt') #file storage on a computer
You can also download files from the Google Cloud Storage server to your computer using the following command.
# gsutil -m cp -r <file location on the server> <file storage on the computer>
gsutil -m cp -r gs://kesaktopdi.appspot.com/text.txt /home/login

How to call refFromURL in Firebase Cloud Function

I'm storing references to files in Firebase Cloud Storage using URLs. In firebase client code, you can call firebase.storage().refFromURL(photo.image) to get the actual storage reference and do handy things like call delete with it. How do I accomplish the same thing in a cloud function (specifically a realtime database trigger)? I want to be able to clean up images after deleting the object that references them.
Following Bob Snyder's answer, here is a little TypeScript function to extract the file's full path from a URL.
export const getFileFromURL = (fileURL: string): string => {
const fSlashes = fileURL.split('/');
const fQuery = fSlashes[fSlashes.length - 1].split('?');
const segments = fQuery[0].split('%2F');
const fileName = segments.join('/');
return fileName;
}
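For a quick sanity check, here is the same logic in plain JavaScript applied to a made-up download URL (hypothetical bucket, folder, and token):

```javascript
// Same logic as the TypeScript helper above, in plain JavaScript:
// take the last path segment, drop the query string, and turn the
// encoded %2F separators back into "/".
const getFileFromURL = (fileURL) => {
  const fSlashes = fileURL.split("/");
  const fQuery = fSlashes[fSlashes.length - 1].split("?");
  const segments = fQuery[0].split("%2F");
  return segments.join("/");
};

const url = "https://firebasestorage.googleapis.com/v0/b/demo.appspot.com/o/background_thumbnail%2Fimage_0.png?alt=media&token=abc123";
console.log(getFileFromURL(url)); // "background_thumbnail/image_0.png"
```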
In a cloud function, to delete a file from storage you need the file's bucket name and file name (which includes the path). Those can be obtained on the client side from the storage reference. For example, a JS Storage Reference has properties bucket and fullPath. The string representation of a storage reference has format: gs://example-12345.appspot.com/path/to/file, where the bucket is example-12345.appspot.com and the file "name" is path/to/file.
In the example cloud function shown below, the client is expected to provide the bucket and filename as children of the trigger location. You could also write the URL string to the trigger location and then split it into bucket and filename components in the cloud function.
This code is based on the example in the Cloud Storage guide.
const functions = require('firebase-functions');
const gcs = require('@google-cloud/storage')();
const admin = require('firebase-admin');
admin.initializeApp(functions.config().firebase);
exports.deleteFile = functions.database.ref('/test').onWrite(event => {
const bucket = event.data.child('bucket').val();
const filename = event.data.child('filename').val();
console.log('bucket=', bucket, 'filename=', filename);
return gcs.bucket(bucket).file(filename).delete().then(() => {
console.log(`gs://${bucket}/${filename} deleted.`);
}).catch((err) => {
console.error('ERROR:', err);
});
});
Here is a one-liner.
const refFromURL = (URL) => decodeURIComponent(URL.split('/').pop().split('?')[0])
I've written a code sample, based on Bob Snyder's answer, which I use in my functions project instead of the refFromURL method from the Firebase web SDK.
function refFromUrl(gsLink) {
var fileEntryTemp = gsLink.file.replace("gs://", "")
var bucketName = fileEntryTemp.substring(0, fileEntryTemp.indexOf("/"));
var filename = gsLink.file.match("gs://" + bucketName + "/" + "(.*)")[1];
var gsReference = admin.storage().bucket(bucketName).file(filename);
return gsReference;
}
Here is an example how I get a download link based on this ref:
var gsReference = refFromUrl(fileEntry);
gsReference.getSignedUrl({
action: 'read',
expires: '03-09-2491'
}).then(function (url) {
console.log(url);
response.send(url);
}).catch(function (error) {
console.error(error);
});
Hope this will save time for somebody
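For reference, the bucket/path split that the snippet above performs on a gs:// link can be wrapped in a small standalone helper (a sketch; the link below is made up):

```javascript
// Splits "gs://<bucket>/<path>" into its bucket and file-path parts,
// matching the string the JS SDK prints for a storage reference.
function parseGsLink(gsLink) {
  const withoutScheme = gsLink.replace("gs://", "");
  const slash = withoutScheme.indexOf("/");
  return {
    bucketName: withoutScheme.substring(0, slash),
    filename: withoutScheme.substring(slash + 1),
  };
}

const parsed = parseGsLink("gs://example-12345.appspot.com/path/to/file.png");
console.log(parsed.bucketName); // "example-12345.appspot.com"
console.log(parsed.filename);   // "path/to/file.png"
```

The two parts can then be fed to admin.storage().bucket(parsed.bucketName).file(parsed.filename) as in the snippet above.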
For complicated actions on your database from Cloud Functions you could use the Admin SDK: https://firebase.google.com/docs/database/admin/start
For the usage of Cloud Storage in Cloud Functions, check this out: https://firebase.google.com/docs/functions/gcp-storage-events
Cloud Functions may not provide the same capability as the client, since Cloud Functions is in beta for now and people are still working on it.
