I have a DC/OS Mesos cluster with 1 master, 2 private agents and 1 public agent.
Each private agent offers 4 CPU and 14.7 GB memory.
The problem is: How could I allocate a service that needs 8 CPUs and 20 GB?
Mesos tries to place the service on a single node, so I would like to combine the agents' resources to run this service (JupyterLab, for more info).
Thanks!
That’s simply not possible. You cannot create a service that needs more resources than a single agent can offer.
Please check e.g. http://datastrophic.io/resource-allocation-in-mesos-dominant-resource-fairness-explained/ to get an idea on how resource allocation works with Mesos.
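To make the constraint concrete, here is a minimal Python sketch of the placement check, using the numbers from the question. The `Agent` class and `fits_on_single_agent` function are hypothetical names for illustration only, not part of any Mesos or DC/OS API, and the real allocator is considerably more involved.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Agent:
    """Resources offered by one agent (hypothetical model, not a Mesos API)."""
    name: str
    cpus: float
    mem_gb: float


def fits_on_single_agent(cpus: float, mem_gb: float, agents: Sequence[Agent]) -> bool:
    """Return True if at least one agent alone can satisfy the whole request."""
    return any(a.cpus >= cpus and a.mem_gb >= mem_gb for a in agents)


# The two private agents from the question.
agents = [Agent("private-1", 4, 14.7), Agent("private-2", 4, 14.7)]

# 8 CPUs / 20 GB exceeds every individual agent, even though the
# cluster-wide total (8 CPUs, 29.4 GB) would be enough on paper.
too_big = fits_on_single_agent(8, 20, agents)

# A request sized to fit within one agent can be placed.
ok = fits_on_single_agent(4, 14, agents)
```

The check runs per agent, never against the sum, which is why the 8 CPU / 20 GB service stays unscheduled.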
Related
I am using the preview feature of configuring an AKS cluster via Arc in Azure Machine Learning Studio and attempting to submit a training job; however, it gets stuck in the Queued state with the following message:
Queue Information : Job is waiting for available resources, that required for 1 instance with 1.00 vCPU(s), 4.00 GB memory and 0 GPU(s). The best-fit compute can only provide 1.90 vCPU(s), 4.46 GB memory and 0 GPU(s). Please continue to wait or change to a smaller instance type
I am not too sure exactly what this is telling me because (aside from the grammar) the job requirements are LESS than what is available, so why is it blocking? Also, it's telling me to change to a smaller instance type, which I did, and it still gave me the same message.
Anyone come across this or know how to get past it?
Create and attach an Azure Kubernetes Service cluster limitations: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-kubernetes?tabs=python#limitations
When I run Spark jobs on a YARN cluster, the applications wait in a queue and run one after another. How can I run several applications in parallel?
I suppose your YARN scheduler option is set to FIFO. Please change it to the Fair Scheduler or the Capacity Scheduler. The Fair Scheduler attempts to allocate resources so that all running applications get the same share of resources.
The Capacity Scheduler allows sharing of a Hadoop cluster along
organizational lines, whereby each organization is allocated a certain
capacity of the overall cluster. Each organization is set up with a
dedicated queue that is configured to use a given fraction of the
cluster capacity. Queues may be further divided in hierarchical
fashion, allowing each organization to share its cluster allowance
between different groups of users within the organization. Within a
queue, applications are scheduled using FIFO scheduling.
If you are using the Capacity Scheduler, then:
Specify your queue in spark-submit: --queue queueName
Try changing this Capacity Scheduler property:
yarn.scheduler.capacity.maximum-applications = any number
It limits how many applications can be active (running or pending) at the same time.
By default, Spark will acquire all available resources when it launches a job.
You can limit the amount of resources consumed for each job via the spark-submit command.
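Add the option "--conf spark.cores.max=1" to spark-submit. You can change the number of cores to suit your environment. For example, if you have 100 total cores, you might limit a single job to 25 cores or 5 cores, etc. Note that spark.cores.max applies to standalone and Mesos deployments; on YARN, the corresponding controls are --num-executors and --executor-cores.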
You can also limit the amount of memory consumed: --conf spark.executor.memory=4g
You can change settings via spark-submit or in the file conf/spark-defaults.conf. Here is a link with documentation:
Spark Configuration
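As a rough illustration of how the spark-submit options mentioned in the answers above fit together, here is a small Python sketch that assembles the command line. `build_spark_submit` is a hypothetical helper and `my_job.py` is a placeholder application name; only the `--master`, `--queue` and `--conf` options themselves come from spark-submit.

```python
import shlex


def build_spark_submit(app, queue=None, executor_memory=None, cores_max=None):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    cmd = ["spark-submit", "--master", "yarn"]
    if queue is not None:
        # Target a specific scheduler queue.
        cmd += ["--queue", queue]
    if executor_memory is not None:
        # Cap the memory each executor may use.
        cmd += ["--conf", f"spark.executor.memory={executor_memory}"]
    if cores_max is not None:
        # Cap the total cores for the application, as suggested in the answer.
        cmd += ["--conf", f"spark.cores.max={cores_max}"]
    cmd.append(app)
    return cmd


# "my_job.py" and "queueName" are placeholders, not real names.
cmd = build_spark_submit(
    "my_job.py", queue="queueName", executor_memory="4g", cores_max=1
)
command_line = shlex.join(cmd)
```

Building the arguments as a list keeps each option separate, so the same list can be passed to `subprocess.run(cmd)` without shell quoting concerns.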
I have deployed 2 identical compute nodes in an OpenStack environment (Mitaka).
Each compute node has 2 physical CPUs with 12 cores each.
I would like to create a single VM which has as many processors as possible.
I don't want to oversubscribe pCPUs to vCPUs, i.e. I would keep the physical-to-virtual ratio at 1:1.
However, it seems I am only allowed to create a maximum of 24 vCPUs in a single VM, even though I have 48 vCPUs in my resource pool (the sum of 2 compute nodes, each contributing 24 vCPUs).
Anyone have an idea how to create more vCPU in my case?
You cannot create an instance that spans multiple compute nodes with OpenStack ... or with any open-source virtualization platform that I am aware of.
The proprietary vSMP product (vendor ScaleMP) can do this and there may be other products.
The other approach that you could take is to build a cluster consisting of multiple instances, and use a batch scheduler and / or some kind of message passing framework to perform computations spanning the cluster.
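The 24 vCPU ceiling from the question follows from simple arithmetic, sketched below in Python. `max_vcpus_per_instance` is a hypothetical helper, not an OpenStack API, and it assumes the 1:1 physical-to-virtual ratio stated in the question.

```python
def max_vcpus_per_instance(cores_per_host):
    """Largest vCPU count a single instance can get at a 1:1 ratio.

    An instance lives on exactly one compute node, so the limit is the
    biggest single host, not the sum over all hosts.
    """
    return max(cores_per_host)


# Two compute nodes, each with 2 physical CPUs x 12 cores.
hosts = [2 * 12, 2 * 12]

largest_vm = max_vcpus_per_instance(hosts)  # limit for one instance
pool_total = sum(hosts)                     # what the resource pool reports
```

Here `largest_vm` is 24 while `pool_total` is 48: the pool figure only tells you how many vCPUs can be handed out across all instances together.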
I am running a large scale ERP system on the following server configuration. The application is developed using AngularJS and ASP.NET 4.5
Hardware: Dell PowerEdge R730 (Quad Core 2.7 GHz, 32 GB RAM, 5 x 500 GB hard disks, RAID 5 configured)
Software: the host OS is VMware ESXi 6.0. Two VMs run on VMware ESXi:
One is Windows Server 2012 R2 with 16 GB memory allocated; this contains the IIS 8 server with my application code.
The other is also Windows Server 2012 R2, with SQL Server 2012 and 16 GB memory allocated; this contains just my application database.
You see, I separated the application server and database server for load balancing purposes.
My application contains a registration module where the load is expected to be very very high (around 10,000 visitors over 10 minutes)
To support this volume of requests, I have done the following on my IIS server:
-> increased the application pool request queue length to 5000
-> enabled output caching for aspx files
-> enabled static and dynamic compression in the IIS server
-> set the virtual memory limit and private memory limit of each application pool to 0
-> increased the maximum worker processes of each application pool to 6
I then used gatling to run load testing on my application. I injected 500 users at once into my registration module.
However, I see that only 40-45% of my RAM is being used. Each worker process is using a maximum of only 130 MB or so.
And gatling is reporting that around 20% of my requests are getting 403 errors, and more than 60% of all HTTP requests have a response time greater than 20 seconds.
A single user makes 380 HTTP requests over a span of around 3 minutes. The total data transfer of a single user is 1.5 MB. I have simulated 500 users like this.
Is there anything missing in my server tuning? I have already tuned my application code to minimize memory leaks, increase timeouts, and so on.
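For a sense of scale, the figures above can be turned into rough rates with a few lines of Python. This is a back-of-the-envelope sketch that assumes all 500 simulated users run concurrently for the full 3 minutes with requests spread evenly, which the actual test may not match.

```python
# Figures taken from the load test described above.
users = 500
requests_per_user = 380
session_seconds = 3 * 60
mb_per_user = 1.5

# Aggregate load generated by the test (assumes evenly spread requests).
total_requests = users * requests_per_user
requests_per_second = total_requests / session_seconds
throughput_mb_per_second = users * mb_per_user / session_seconds

# Expected production load: 10,000 visitors arriving over 10 minutes.
expected_arrivals_per_second = 10_000 / (10 * 60)

print(f"total requests:   {total_requests}")
print(f"requests/second:  {requests_per_second:.1f}")
print(f"MB/second:        {throughput_mb_per_second:.2f}")
print(f"arrivals/second:  {expected_arrivals_per_second:.1f}")
```

Under these assumptions the test produces roughly a thousand requests per second at only a few MB per second, so the request rate rather than the data volume is the number to compare against the queue length and worker process settings.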
There is a known issue with the newest generation of PowerEdge servers that use the Broadcom network chipset. Apparently, the "VM" feature for the network is broken, which results in horrible network latency on VMs.
Head to Dell and get the most recent firmware and Windows drivers for the Broadcom.
Head to VMware Downloads and get the latest Broadcom driver.
As for the worker process settings, for maximum performance, you should consider running the same number of worker processes as there are NUMA nodes, so that there is 1:1 affinity between the worker processes and NUMA nodes. This can be done by setting "Maximum Worker Processes" AppPool setting to 0. In this setting, IIS determines how many NUMA nodes are available on the hardware and starts the same number of worker processes.
I guess the one caveat to the answer you received is that if your server isn't NUMA-aware (i.e. it uses symmetric processing), you won't see those IIS options under CPU, but the above poster seems to know a good bit more than I do about the machine. Sorry, I don't have enough street cred to add this as a comment. As for IIS, you may also want to make sure your app pool doesn't use the default recycle conditions, and pick a fixed time like midnight for recycling. If you have root-level settings applied, the default app pool recycling at 29 hours may also trigger garbage collection against your child pool, causing delays even with concurrent GC, where it sounds like you may benefit a bit from gcServer=true. Pretty tough to assess that though.
Has your SQL Server been optimized for that type of workload? If durability of your data isn't paramount, you could squeeze out faster execution times with delayed durability, then assess queries that are returning too much info for async IO wait types. In general there's not enough here to really assess SQL optimizations, but if it is not configured right (size/growth options) you could be hitting a lot of timeouts due to file growth, VLF fragmentation, etc.
I'm totally new to OpenStack, and after seeing tutorials running it on both a single node and multiple nodes (at least 3, consisting of 1 controller, 1 compute and 1 network node), I was wondering what the difference is, and whether there are any advantages of multi-node setups over single-node ones?
An OpenStack system consists of a lot of services. If you are running all of these on a single node, there will be a resource scarcity issue unless you have a machine with very high CPU, RAM, etc. Another advantage of a multi-node configuration is failover. In a 3-node config, if one node is down you can continue with 2 nodes (provided you have service replication). Better to go for at least a 3-node config, which is recommended by OpenStack.
With a multi-node configuration, you can achieve a scale-out storage solution by adding more storage nodes as your needs grow. Also, several compute nodes can be used to meet increasing computation needs.