I have a number of Firebase Cloud Functions. I'm trying to keep one of them warm by setting minInstances, like this:
export const regionalFunctions = functions.region(cloudFunctionRegion);
export const manageDocs = regionalFunctions
  .runWith({ timeoutSeconds: 9 * 60, memory: '256MB' as '256MB', minInstances: 1 })
  .https.onCall(async (data, context) => {
but when I try to deploy the code, it looks like minInstances gets set to 1 for all my cloud functions. Output from firebase deploy:
functions: The following functions have reserved minimum instances. This will reduce the frequency of cold starts but increases the minimum cost. You will be charged for the memory allocation and a fraction of the CPU allocation of instances while they are idle.
addFromCloudService(europe-west1): 1 instances, 256MB of memory each
mailChimp(europe-west1): 1 instances, 2GB of memory each
manageDocs(europe-west1): 1 instances, 256MB of memory each
onCreate(europe-west1): 1 instances, 2GB of memory each
onDocumentCreate(europe-west1): 1 instances, 2GB of memory each
onDocumentDelete(europe-west1): 1 instances, 2GB of memory each
onFinalize(europe-west1): 1 instances, 2GB of memory each
onUserAdded(europe-west1): 1 instances, 2GB of memory each
onUserDelete(europe-west1): 1 instances, 2GB of memory each
paddleUpgrade(europe-west1): 1 instances, 2GB of memory each
paddleWebhook(europe-west1): 1 instances, 2GB of memory each
splitDocument(europe-west1): 1 instances, 2GB of memory each
With these options, your minimum bill will be $194.12 in a 30-day month
? Would you like to proceed with deployment? No
Why is it doing this, when I have only set minInstances for the function manageDocs? How can I set minInstances for only 1 function?
I'm using Airflow through Cloud Composer (Image: composer-2.0.29-airflow-2.3.3). I have defined 5 DAGs that run concurrently, with at most 22 tasks running concurrently, distributed among the 5 DAGs. These DAGs are in the default_pool, which has the default number of slots (128).
My composer instance has:
1 Scheduler: 0.5 vCPUs, 1.875 GB memory, 1 GB storage
Worker: 0.5 vCPUs, 1.875 GB memory, 1 GB storage
Autoscaling worker: from 1 to 3.
I would like to create different pools to separate my 5 systems. How do I define the number of slots in each pool? Suppose a pool has 1 DAG with 10 tasks, of which at most 5 run concurrently. How many slots should I assign to each task?
DAG example:
task1.x is the ingestion of a JDBC table, while task2.x updates the corresponding BigQuery table.
Thank you all!
Airflow pools are designed to avoid overwhelming external systems used by a group of tasks. For example, if you have some tasks in different DAGs which use a machine learning model API, an RDBMS, an API with quotas, or any other system with limited scaling, you can use an Airflow pool to limit the number of parallel tasks which interact with this system.
In your case, you have two systems, the JDBC database and BigQuery. You need to create just two pools, jdbc_pool and bigquery_pool, and assign all the tasks (from all the DAGs) which interact with the JDBC table to the first one, and all the tasks which interact with BigQuery to the second one. For the slots, you can define them based on the performance of each system and the computational weight of each task.
If you have a monitoring tool (Prometheus, Datadog, ...), you can run one of the tasks and watch the resource usage on your db. Let's assume it uses 10% of the resources; in this case you can create a pool with 8 slots, so that at most 80% of the resources are used (you should avoid using 100% of the resources, to leave headroom for unexpected load). Then, for the pool slots of each task:
if all the tasks are similar, you can use pool_slots=1 for all of them: at most 8 parallel tasks, using 80% of the resources
if some tasks are heavier than the one you tested (they use more than 10% of the db resources), you can use a higher pool_slots value for those tasks, based on their resource usage. Say one task consumes 20% of the resources: you can use pool_slots=2 for that task only and keep 1 for the others; then you can have either 8 parallel simple tasks, or 6 parallel simple tasks plus this heavy task, with 80% of resources usage in both cases (see the sketch below).
For bigquery_pool, you need to check what the quotas are, but I think you can use a high value without any problem, since BigQuery is a very scalable serverless DWH.
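As a rough sketch of how the assignment could look in one of the DAG files (the pool names, slot counts, task ids and callables below are illustrative assumptions, not taken from your DAGs; the pools themselves can be created up front in the UI or with the airflow pools set CLI command):

# Pools can be created once, e.g. via the CLI (names, slots and descriptions are illustrative):
#   airflow pools set jdbc_pool 8 "limit concurrent JDBC ingestion tasks"
#   airflow pools set bigquery_pool 32 "limit concurrent BigQuery tasks"
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_jdbc_table(**context):
    ...  # placeholder: read the JDBC table


def update_bq_table(**context):
    ...  # placeholder: update the corresponding BigQuery table


with DAG("system_1", start_date=datetime(2023, 1, 1), schedule_interval=None, catchup=False) as dag:
    simple_ingest = PythonOperator(
        task_id="task1_1",
        python_callable=ingest_jdbc_table,
        pool="jdbc_pool",
        pool_slots=1,  # a "simple" task: takes 1 of the 8 slots (~10% of db resources)
    )
    heavy_ingest = PythonOperator(
        task_id="task1_2",
        python_callable=ingest_jdbc_table,
        pool="jdbc_pool",
        pool_slots=2,  # a heavier task (~20% of db resources): takes 2 slots
    )
    bq_update = PythonOperator(
        task_id="task2_1",
        python_callable=update_bq_table,
        pool="bigquery_pool",
        pool_slots=1,
    )
    [simple_ingest, heavy_ingest] >> bq_update

Any task that does not set pool keeps running in default_pool, so only the tasks you assign explicitly are throttled by these two pools.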
If you just want to limit the number of tasks executed on each worker, for example to avoid OOM problems, you can set the worker_concurrency configuration option.
And if you want to limit the number of tasks executed across the whole Airflow installation, you can set the parallelism configuration option.
I am trying to decide which process launcher, mpirun or srun, is better at optimizing resources. Let's say one compute node in a cluster has 16 cores in total and I have a job I want to run using 10 processes.
If I launch it with mpirun -n10, will it detect that my request needs fewer cores than are available on each node and automatically assign all 10 cores from a single node? Unlike srun, which has -N <number> to specify the number of nodes, mpirun doesn't seem to have such a flag. I am thinking that running all processes on one node can reduce communication time.
In the example above, let's further assume that each node has 2 CPUs and the cores are distributed equally, so 8 cores/CPU, and the specification says there is 48 GB of memory per node (i.e. 24 GB/CPU or 3 GB/core). And suppose that each spawned process in my job requires 2.5 GB, so all processes will use up 25 GB. When does one say that a program exceeds the memory limit? Is it when the total required memory:
exceeds per node memory (hence my program is good, 25 GB < 48 GB), or
exceeds per CPU memory (hence my program is bad, 25 GB > 24 GB), or
when the memory per process exceeds per core memory (hence my program is good, 2.5 GB < 3 GB)?
mpirun has no information about the cluster resources. It will not request the resources; you must first request an allocation, typically with sbatch or salloc, and then Slurm will set up the environment so that mpirun knows on which node(s) to start processes. So you should have a look at the sbatch and salloc options to create a request that matches your needs. By default, Slurm will try to 'pack' jobs onto the minimum number of nodes.
srun can also work inside an allocation created by sbatch or salloc, but it can also make the request by itself.
I have a Storm topology running on AWS. I use m3.xlarge machines with 15 GB RAM and 8 supervisors. My topology is simple; I read from
kafka spout -> [db o/p1] -> [db o/p2] -> [dynamo fetch] -> [dynamo write & kafka write] kafka
The db o/ps are conditional, with latency around 100-150 ms.
But I have never been able to achieve a throughput of more than 300 msgs/sec.
What configuration changes should I make so that I can get a throughput of more than 3k msgs/sec?
dynamo fetch bolt execute latency is around 150 - 220ms
and dynamo read bolt execute latency is also around this number.
four bolts with parallelism 90 each and one spout with parallelism 30 (30 kafka partitions)
overall latency is greater than 4 secs.
topology.message.timeout.secs: 600
worker.childopts: "-Xmx5120m"
no. of worker ports per machine : 2
no of workers : 6
no of threads : 414
executor send buffer size 16384
executor receive buffer size 16384
transfer buffer size: 34
no of ackers: 24
Looking at the console snapshot I see...
1) The overall latency for the Spout is much greater than the sum of the execute latencies of the bolts, which implies that there's a backlog on one of the streams, and
2) The capacity for SEBolt is much higher than that of the other bolts, implying that Storm feels the need to run that bolt more than the others
So I think your bottleneck is the SEBolt. Look into increasing the parallelism hint on that one. If the total number of tasks is getting too high, reduce the parallelism hint for the other bolts to offset the increase for SEBolt.
We are testing a standard EBS volume and an EBS volume with encryption, on an EBS-optimized m3.xlarge EC2 instance.
While analyzing the test results, we found that
the EBS volume with encryption takes less time for read, write, and read/write operations than the EBS volume without encryption.
I expected there to be a latency penalty on the encrypted EBS volume because of the extra encryption overhead on every I/O request.
What would be the reason that encrypted EBS volumes are faster than normal EBS volumes?
The expected result is that plain EBS should yield better results than encrypted EBS.
Results:
Encrypted EBS results:
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 8
Initializing random number generator from timer.
Extra file open flags: 16384
8 files, 512Mb each
4Gb total file size
Block size 16Kb
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing sequential write (creation) test
Threads started!
Done.
Operations performed: 0 Read, 262144 Write, 8 Other = 262152 Total
Read 0b Written 4Gb Total transferred 4Gb (11.018Mb/sec)
705.12 Requests/sec executed
Test execution summary:
total time: 371.7713s
total number of events: 262144
total time taken by event execution: 2973.6874
per-request statistics:
min: 1.06ms
avg: 11.34ms
max: 3461.45ms
approx. 95 percentile: 1.72ms
EBS results:
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 8
Initializing random number generator from timer.
Extra file open flags: 16384
8 files, 512Mb each
4Gb total file size
Block size 16Kb
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing sequential write (creation) test
Threads started!
Done.
Operations performed: 0 Read, 262144 Write, 8 Other = 262152 Total
Read 0b Written 4Gb Total transferred 4Gb (6.3501Mb/sec)
406.41 Requests/sec executed
Test execution summary:
total time: 645.0251s
total number of events: 262144
total time taken by event execution: 5159.7466
per-request statistics:
min: 0.88ms
avg: 19.68ms
max: 5700.71ms
approx. 95 percentile: 6.31ms
Please help me resolve this issue.
That's certainly unexpected; conceptually, Amazon EBS Encryption also confirms that there shouldn't be a noticeable difference:
[...] and you can expect the same provisioned IOPS performance on encrypted volumes as you would with unencrypted volumes with a minimal effect on latency. You can access encrypted Amazon EBS volumes the same way you access existing volumes; encryption and decryption are handled transparently and they require no additional action from you, your EC2 instance, or your application. [...] [emphasis mine]
Amazon EBS Volume Performance provides more details on EBS performance in general. From that angle (but this is pure speculation), maybe the use of encryption implies some default pre-warming, as described in Pre-Warming Amazon EBS Volumes:
When you create any new EBS volume (General Purpose (SSD), Provisioned IOPS (SSD), or Magnetic) or restore a volume from a snapshot, the back-end storage blocks are allocated to you immediately. However, the first time you access a block of storage, it must be either wiped clean (for new volumes) or instantiated from its snapshot (for restored volumes) before you can access the block. This preliminary action takes time and can cause a 5 to 50 percent loss of IOPS for your volume the first time each block is accessed. [...]
Either way, I suggest rerunning the benchmark after pre-warming both new EBS volumes, in case you haven't done so already.
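For completeness, a minimal sketch of touching every block of a volume once before benchmarking (this is just an illustration; dd or fio is the usual tool, the device path /dev/xvdf is an assumed example, and it must be run as root against the raw, unmounted device):

# Read every block once so the first-access initialization described in the
# quoted documentation happens before, not during, the benchmark.
CHUNK = 1024 * 1024  # 1 MiB

with open("/dev/xvdf", "rb", buffering=0) as dev:  # assumed device name
    while dev.read(CHUNK):
        pass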
The problem is that the IIS worker process consumes too much memory. After inspecting the w3wp process with VMMap, I noticed that the biggest component of the Private WS is the managed heap, i.e. the GC memory.
Further, I inspected the w3wp process using Performance Monitor, and the results were as follows:
# Bytes in All Heaps : 32MB
# Gen 0 Collections : 4
# Gen 1 Collections : 3
# Gen 2 Collections : 2
Gen 0 Heap Size 570MB
Gen 1 Heap Size 5MB
Gen 2 Heap Size 26MB
Active Sessions : 4
The Gen 0 heap size increases with every new session. The peak is when I have 4 active sessions (~570MB). Then, when I have 6 sessions, it decreases to ~250MB, and then increases again until the application pool is recycled (~8-9 active sessions).
As far as I know, the Gen 0 heap size should be very small (comparable to the L2 cache), and this is the size that triggers the GC to run Gen 0 collections.
Why is the Gen 0 heap size so big?
I have the following environment:
IIS 6.0
The application is Asp.Net WebForms
The application pool is restricted to 700 MB, and it gets recycled when
I have ~8-9 active sessions, so all sessions are lost.
.Net Framework v4.0.3
64 bit version of w3wp worker.
I also inspected the application memory using the CLR Profiler, and the
number of bytes in all heaps is 10-60 MB, depending on the number of active sessions.
Thank you!
http://msdn.microsoft.com/en-us/library/ee817660.aspx
Use WinDbg or any commercial .NET memory profiler; you should be able to review what objects are in the heap and whether they should be there.
Common causes are string manipulation without StringBuilder and big objects in session such as DataTable.
Find the exact cause in your case and remedy it.