I have set up the Firebase local emulator and created a project with Cloud Functions and Firestore. I also exported my production data, which I import into the emulator. (The collection in question is about 5000 documents ranging from 5 kB to 200 kB in size.)
My goal was to benchmark query performance, so I wrote a query and ran it a number of times to get an average execution time of 130 ms. I then wrote a different query and got an average execution time of 20 ms. I did not import any indexes (the Admin SDK doesn't seem to require them when querying the emulator like it does when querying production).
I also observed that the first run of a query always takes significantly longer than subsequent runs.
My question is basically: how does this difference in execution time translate to the production environment, if at all? Assuming the same queries are run against the same data, and ignoring network latency to/from the client, will the second query run about ~110 ms faster? Or will the difference be smaller/larger?
Also, why does the first run take longer, and is there any way to use that fact to improve performance in real-world usage?
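For concreteness, the timing loop looked roughly like this (a minimal sketch, not my exact benchmark; the collection name and filter are illustrative, and it assumes FIRESTORE_EMULATOR_HOST is set, e.g. to "localhost:8080", so the Admin SDK targets the emulator):

```ts
// Minimal sketch: average the wall-clock time of a Firestore query against
// the emulator. Collection name and filter are illustrative placeholders.
import * as admin from 'firebase-admin';

admin.initializeApp({ projectId: 'demo-project' });
const db = admin.firestore();

async function averageQueryMs(runs: number): Promise<number> {
  let totalMs = 0;
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await db.collection('items').where('status', '==', 'active').limit(50).get();
    totalMs += Date.now() - start;
  }
  return totalMs / runs;
}

averageQueryMs(20).then((avg) => console.log(`average: ${avg.toFixed(1)} ms`));
```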
how does this difference in execution time translate to the production environment, if at all?
The observed performance of the emulator has little to nothing to do with the performance of the actual cloud-hosted product. It's not the same code, and it's not running on the same set of computing resources.
Firestore is massively scalable and shards your data across many computing resources, all of which work together to service a query and ensure that it performs at any scale. As you can imagine, an emulator running on your one local machine is nowhere near that. They are simply not comparable.
The emulator is meant to ease local development without requiring the use of paid cloud resources to get your job done. It's not meant for any other purpose.
Related
I am developing a new React website using Firebase Hosting and Firebase Functions.
I am using a MySQL database (SQL is required for heavy data reporting) in GCP Cloud SQL, and GCP Secret Manager to house the database username/password.
The Firebase functions are used to pull data from the database and send the results back to the React app.
In my local emulator everything works correctly and it's responsive.
When it's deployed to Firebase, I'm noticing the 1st and sometimes the 2nd request to a function takes about 6 seconds to respond. After that, they respond in less than 1 second. For the slow responses I can see in the logs that the database pool is being initialized.
So the slow responses are the first hit to the instance. I'm assuming in my case two instances are being created.
Note that the functions that do not require a database respond quickly regardless of whether it is the 1st or 2nd call.
After about 15 minutes of not using a service I have the same problem. I'm assuming the instances are being reclaimed and a new instance is created on the next request.
The problem is each function will have its own independent db pool, so each function will initially give a slow response (maybe twice, for the second call).
The site will see low traffic, meaning most users will experience this slow response.
By removing the reference to Secret Manager and hard-coding the username/password, the response time has dropped to less than 3 seconds. But this is still not acceptable.
Is there a way to:
Increase the time before an idle function is reclaimed?
Tag an instance so that it should not be reclaimed?
Is there a way to create a global db pool that does not get shut down between recycles?
Is there an approach for working with db connections in Firebase Functions that avoids reinitializing the db pool?
Is this just the nature of functions, and am I limited to this behavior?
Since I am in early development, would moving to App Engine/Node.js (the flexible environment) resolve the recycling issues?
First of all, the issues you have been experiencing with the 1st and the 2nd requests taking the longest time are called cold starts.
This totally makes sense because new instances are spun up. You may have a cold start when:
Your function has been deployed but not yet triggered.
Your function has been idle (not processing requests) long enough that it has been recycled to free up resources.
Your function is auto-scaling to handle capacity and creating new instances.
I understand that your five questions are intended to work around the issue of Cloud Functions recycling the instances.
The straight answer to questions 1 to 4 is no, because Cloud Functions implements the serverless paradigm.
This means that one function invocation cannot rely on in-memory state (such as a database pool) set by a previous invocation, since the instance that holds it can be recycled at any time.
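That said, module-scope state does survive for as long as a particular instance stays warm, so the usual mitigation is to initialize the pool (and the secret lookup) lazily in global scope, so only cold starts pay the cost. A rough sketch, with illustrative package choices, env vars, and names; the reuse is best-effort, not guaranteed:

```ts
// Rough sketch: module-scope state is reused across warm invocations on the
// same instance, but can vanish whenever the instance is recycled.
import * as functions from 'firebase-functions';
import * as mysql from 'mysql2/promise';
import { SecretManagerServiceClient } from '@google-cloud/secret-manager';

let pool: mysql.Pool | undefined;

async function getPool(): Promise<mysql.Pool> {
  if (!pool) {
    // Paid only on a cold start; warm invocations skip both the secret
    // fetch and the pool creation.
    const client = new SecretManagerServiceClient();
    const [version] = await client.accessSecretVersion({
      name: 'projects/my-project/secrets/db-password/versions/latest',
    });
    pool = mysql.createPool({
      host: process.env.DB_HOST,
      user: process.env.DB_USER,
      password: version.payload?.data?.toString(),
      database: 'reporting',
      connectionLimit: 1, // a function instance serves one request at a time
    });
  }
  return pool;
}

export const report = functions.https.onRequest(async (req, res) => {
  const [rows] = await (await getPool()).query('SELECT COUNT(*) AS n FROM orders');
  res.json(rows);
});
```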
Now this does not mean that you cannot improve the cold start boot time.
Generally, the number one contributor to cold start boot time is the number of dependencies.
This video from the Google Cloud Tech channel goes over exactly the issue you have been experiencing and describes in more detail the practices for tuning up Cloud Functions.
If, after going through the best practices from the video, your cold starts still show unacceptable values, then, as you have already suggested, you would need to use a product that allows you to keep a minimum set of instances spun up, like App Engine Standard.
You can further improve the readiness of the App Engine Standard instances by later implementing warmup requests.
Warmup requests load your app's code into a new instance before any live requests reach that instance. The same documentation also covers loading requests, which are related to cold starts: a loading request is one that has to wait while your app's code is loaded into a newly created instance.
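For illustration, a minimal sketch of a warmup handler for a Node.js app on App Engine Standard (the /_ah/warmup path is the documented one; the init function here is only a stand-in for your real setup, and app.yaml must declare warmup under inbound_services):

```ts
// Sketch only: App Engine Standard sends GET /_ah/warmup to new instances
// when "inbound_services: [warmup]" is declared in app.yaml. initDb() is a
// placeholder for real setup work (pools, secrets, handshakes).
import express from 'express';

const app = express();

let ready: Promise<void> | undefined;
function initDb(): Promise<void> {
  ready ??= new Promise<void>((resolve) => {
    setTimeout(resolve, 500); // placeholder for expensive initialization
  });
  return ready;
}

// Warmup requests arrive before live traffic, so expensive setup happens here.
app.get('/_ah/warmup', async (_req, res) => {
  await initDb();
  res.status(200).send('warm');
});

app.get('/', async (_req, res) => {
  await initDb(); // usually already resolved thanks to the warmup request
  res.send('ok');
});

app.listen(Number(process.env.PORT) || 8080);
```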
I hope you find this useful.
I developed an Android application where I use Firebase as my main service for storing data, authenticating users, storage, and more.
I recently went deeper into the service and wanted to see the API usage in my Google Cloud Platform console.
In order to do so, I navigated to https://console.cloud.google.com/ and looked at the APIs and Services dashboard, then drilled into the service in question to see what might be causing the numbers I saw.
Can someone please explain what "Latency" means, and what could be the reason that this particular service has a much higher latency value compared to the other APIs?
Does this value have any impact on my application, such as slowing the response or something else? If yes, are there any guidelines for lowering this value?
Thank you
Latency is the "delay" until an operation starts. Cloud Functions, in particular, have to actually load and start a container (if they have been paused), or at least load your code from memory (it depends on how often the function is called).
Can this affect your client? Holy heck, yes. But what you can do about it is a significant study in and of itself. For Cloud Functions, the biggest latency comes from starting the "container" (assuming a cold start, which your low request count suggests): it has to load and initialize modules before calling your code. The same issue applies here as for browser code: tight code, minimal module loads, etc.
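As a concrete (hypothetical) illustration of "minimal module loads": heavy dependencies can be imported lazily, so functions that never need them don't pay for them at cold start. "sharp" below is only a stand-in for any expensive-to-load module:

```ts
// Sketch: defer a heavy dependency to first use so cold starts stay lean.
import * as functions from 'firebase-functions';

export const thumbnail = functions.https.onRequest(async (req, res) => {
  // Loaded on the first request that needs it, not at instance startup.
  const sharp = (await import('sharp')).default;
  // Assumes the raw image bytes are POSTed as the request body.
  const png = await sharp(req.body as Buffer).resize(128, 128).png().toBuffer();
  res.type('png').send(png);
});
```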
Some latency is to be expected from Cloud Functions (I'm pretty sure a couple hundred ms is typical). Design your client UX accordingly. Cloud Functions' real power isn't instantaneous response; rather, it's the compute power available IN PARALLEL with browser operations, and the ability to spin up multiple instances to respond to multiple browser sessions. Use it accordingly.
Listen and Write are long-lived streams. In this case, an 8-minute latency should be interpreted as a connection that was open for 8 minutes. Individual queries or write operations on those streams will be faster (milliseconds).
I am trying to make a call to my web API using Refit in a Xamarin.Forms app. It seems to work well in the emulator (2-5 secs) but crashes most times on a real Android phone, or takes quite long to return on rare occasions. I am using a basic 5 DTU SQL database on Azure. Could this be the reason?
I have tried to make 2 calls from the device, and the spike in the chart above is the result. The first query takes a bit of time, and once it returns (I managed to get a reply this time) I make a second call, which also returns after a delay. Do I need to use indexes at all?
I am using a basic 5 DTU SQL database on Azure. Could this be the reason?
Yes, it is entirely possible that your issues will be resolved by changing your database tier; the graph shows that your query hits that 5 DTU limit a lot. If your query is complex or your dataset is sizable, you may get better performance out of an S-tier database. (Try an S0 or an S1 and check whether you see better performance.)
Do I need to use indexes at all?
Depending on the type of queries you are running, an index may have a beneficial impact on the performance of your database. (You definitely ought to investigate whether you can optimize performance this way.) You might be interested in some of the features of Azure SQL Database that can help you examine and improve query performance: SQL Database Advisor and SQL Query Store.
Problem
The first call to Firebase from a server takes ~15-20x longer than subsequent calls. While this is not a problem for a conventional server calling upon Firebase, it may cause issues with a serverless architecture leveraging Amazon Lambda / Google Cloud Functions.
Questions
Why is the first call so much slower? Is it due to authentication?
Are there any workarounds?
Is it practical to do some user-initiated computation of data on Firebase DB using Amazon Lambda/ Google Cloud Functions and return the results to the client within 1 - 2 seconds?
Context
I am planning on using a serverless architecture with Firebase as the repository of my data, and Amazon Lambda / Cloud Functions augmenting Firebase with some server-side computation, e.g. searching for other users. I intend to trigger the functions via HTTP requests from my client.
One concern that I had was the large time taken by the first call to Firebase from the server. While testing some server-side code on my laptop, the first listener returned in 6 s! Subsequent calls returned in 300-400 ms. The dataset is very small (2-3 key-value pairs), and I also tested by swapping the observers.
In comparison, a call to the Google Maps API from my laptop takes about 400ms to return.
I realise that response times would be considerably faster from a server. Still, a factor of 15-20x on the first call is disconcerting.
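For reference, the measurement was essentially of this shape (a minimal sketch; the database URL and paths are placeholders, not my actual data):

```ts
// Minimal sketch: time the first Realtime Database read (which pays for the
// socket + auth handshake) against a second one that reuses the connection.
import * as admin from 'firebase-admin';

admin.initializeApp({
  credential: admin.credential.applicationDefault(),
  databaseURL: 'https://<your-project>.firebaseio.com', // placeholder
});

async function timedRead(path: string): Promise<number> {
  const start = Date.now();
  await admin.database().ref(path).once('value');
  return Date.now() - start;
}

(async () => {
  console.log(`first read:  ${await timedRead('test/a')} ms`); // includes handshake
  console.log(`second read: ${await timedRead('test/b')} ms`); // reuses the socket
})();
```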
TL;DR: You're noticing something that's known/expected, though we will shave the penalty down as GA approaches. Some improvements will come sooner rather than later.
Cloud Functions for Firebase team member here. We are able to offer Cloud Functions at a competitive price by "scaling to zero" (shutting down all instances) after sustained lack of load. When a request comes in and you have no available instances, Cloud Functions creates one for you on demand. This is obviously slower than hitting an active server and is something we call a "cold start". Cold starts are part of the reality of "serverless" architecture, but we have many people working on ways to reduce the penalty dramatically.
There's another case that I've recently started calling a "lukewarm" start. After a deploy, the Cloud Function instance has been created, but your application still has warmup work to do like establishing a connection to the Firebase Realtime Database. Part of this is authentication, as you've suggested. We have detected a slowdown here that will be fixed next week. After that, you'll still have to pay for SSL + Firebase handshakes. Try measuring this latency; it's not clear how much we'll be able to circumvent it.
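In the meantime, the standard way to keep that handshake cost off the hot path is to initialize the SDK in global scope, so it happens once per instance rather than once per request. A minimal sketch (the function name and database path are illustrative):

```ts
// Sketch: initialize the Admin SDK once per instance (module scope), so only
// cold and "lukewarm" starts pay for the handshake; warm invocations reuse it.
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';

admin.initializeApp(); // once per instance, not once per request

export const search = functions.https.onRequest(async (req, res) => {
  // On a warm instance this reuses the already-authenticated connection.
  const snap = await admin.database().ref('users').limitToFirst(10).once('value');
  res.json(snap.val());
});
```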
Thanks Frank!! I read up on how Firebase establishes WebSocket connections.
To add to Frank's answer: the initial handshake causes the delay in the first pull, and keeping the connection alive drastically speeds up subsequent data pulls. I tested on an Amazon Lambda instance running on a US west coast server. The response times were: 1) first pull: 1.6-2.3 s; 2) subsequent pulls: 60-100 ms. The dataset itself was extremely small, so one can assume that these times are purely server-to-server communication. Takeaways:
Amazon Lambda instances can be triggered via an API gateway for non-time-critical computations, but they are not the ideal solution for real-time computations on Firebase data, such as returning search results (unless there is a way to guarantee persisting the handshake across instances, which there isn't from what I've read).
For time-critical computations, I am going with EC2 / GAE instances leveraging Firebase Queue: https://github.com/firebase/firebase-queue. The approach is more involved than firing Lambda instances, but it returns results faster (because it avoids the handshake for every task); a minimal worker is sketched below.
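The worker, roughly following the firebase-queue README (the queue path and task shape are illustrative):

```ts
// Minimal worker sketch: a long-lived process keeps its Firebase connection,
// so per-task handshakes disappear. Queue path and task fields illustrative.
import * as admin from 'firebase-admin';
// firebase-queue ships no TypeScript types, hence the require.
const Queue = require('firebase-queue');

admin.initializeApp();

const queueRef = admin.database().ref('queue');
new Queue(
  queueRef,
  (data: any, progress: any, resolve: () => void, reject: (e: Error) => void) => {
    // data holds the task the client pushed under /queue/tasks.
    console.log('processing task', data);
    resolve(); // mark the task complete
  }
);
```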
Based on this answer, it looks like the Meteor server keeps an in-memory copy of the cache for each connected client. My understanding is that it is used to avoid sending multiple copies of data when dealing with overlapping subscriptions on a client.
The relevant part of the linked answer (emphasis is mine):
The merge box: The job of the merge box is to combine the results (added, changed and removed calls) of all of a client's active publish functions into a single data stream. There is one merge box for each connected client. It holds a complete copy of the client's minimongo cache.
Assuming that answer is still accurate in the current version of meteor, couldn't that create a huge waste of memory on the server as the number of users increases?
As an off-the-cuff calculation: if an app had about a 100 kB cache per client, then 10,000 concurrent users would use up 1 GB of memory on the server, and 100,000 users a whopping 10 GB! This would be true even if each client was looking at almost identical data. It seems plausible for an app to use much more data than that per client, which would further exacerbate the problem.
Does this problem exist in the current version of Meteor? If so, what techniques can be used to limit the amount of memory the server needs to use to manage all the client subscriptions?
Take a look at this post by Arunoda on his meteorhacks.com blog:
http://meteorhacks.com/making-meteor-500-faster-with-smart-collections.html
which talks about his Smart Collections package:
http://meteorhacks.com/introducing-smart-collections.html
He created an alternative collection stack which has succeeded in its goals of speed, efficiency (memory & CPU), and scalability (you can see a graphed comparison in the post). Admittedly, in his tests RAM usage was negligible with both collection types, although the way he's implemented things, there should be a very obvious difference with the type of use case you mentioned.
Also, you can see in this post on meteor-core:
https://groups.google.com/d/msg/meteor-core/jG1KLObX1bM/39aP4kxqWZUJ
that the Meteor developers are aware of his work and are cooperating on implementing some of the improvements in Meteor itself (but until then, his smart package works great).
Important note! Smart Collections relies on access to the Mongo oplog. This is easy if you're running on your own machine or hosted infrastructure. If you're using a cloud-based database, this option might not be available, or if it is, it will cost a lot more than the smaller packages.
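Independently of Smart Collections, it's worth remembering that the merge box only holds what your publish functions return, so the most direct lever on per-client memory is publishing narrow cursors. A minimal sketch (the collection and field names are illustrative):

```ts
// Sketch: the merge box caches exactly what you publish, so limiting fields
// and row counts directly shrinks the per-client server-side copy.
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Posts = new Mongo.Collection('posts');

Meteor.publish('recentPostTitles', function () {
  return Posts.find(
    {},
    { fields: { title: 1, createdAt: 1 }, sort: { createdAt: -1 }, limit: 20 }
  );
});
```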