IncrediBuild not using all CPU resources, why?

We're using an IncrediBuild 5 coordinator and clients.
When somebody starts a rebuild in VS2010 with the IncrediBuild add-on, the build PC's CPU usage never reaches 100%; it stays at 5-30% the whole time.
The build PC's client priority setting is set to high, and all cores are allocated to build processes.
In the build coordinator we are using these settings:
Build Prio: high
Assignment prio: high
Remote Tasks: High
HyperThreading: enabled and set to max
How can I reach maximum CPU usage on the build clients?

These are actually good results; it means that IncrediBuild distributes your workload to other nodes very efficiently.
It is a good thing that the build machine is not too busy, since it is responsible for synchronizing all the output received from the remote nodes.
If you'd like to test your performance by increasing the workload on the build machine, you can increase the value that tells IncrediBuild how many cores your build machine has. If your machine is an 8-core hyper-threaded machine, IncrediBuild will only execute 8 tasks (processes) in parallel; you can instruct IncrediBuild to execute more tasks in parallel on your machine by changing the number of cores you tell IncrediBuild your machine has:
Right-click the IncrediBuild tray icon -> Agent Settings -> Agent -> CPU Utilization -> choose "User defined" in the combo box and change the value that defines the number of cores your machine has.
Note that in order for IncrediBuild to treat your machine as a 16-core machine, you need an appropriate license. You can ask your IncrediBuild account manager for a temporary license.
Disclaimer: I work for IncrediBuild.

Related

Has anyone used DEX agent in TOSCA and what are the benefits?

Does anyone know the benefits/drawbacks of using the DEX agent in Tosca compared to the execution list?
From my research, it seems that the DEX agent has better performance because it runs the tests on multiple VMs.
An execution list is pushed onto the DEX execution queue, so I assume you mean the benefits/drawbacks of running on DEX versus locally.
When you execute your tests, Tosca takes control of your mouse and keyboard, so it can interact with the system under test. Consequently, users can't work on this machine for the duration of the test run. And if you have large test sets, it simply takes too long to run all of them on one machine.
With Tosca Distributed Execution, you can distribute your tests across all available computing resources, such as computers in your network and virtual machines. This speeds up large test runs and leaves user machines unblocked.

How can I configure a YARN cluster for parallel execution of applications?

When I run Spark jobs on a YARN cluster, the applications run one after another in a queue. How can I run a number of applications in parallel?
I suppose your YARN scheduler option is set to FIFO. Please change it to the Fair Scheduler or the Capacity Scheduler. The Fair Scheduler attempts to allocate resources so that all running applications get the same share of resources.
The Capacity Scheduler allows sharing of a Hadoop cluster along organizational lines, whereby each organization is allocated a certain capacity of the overall cluster. Each organization is set up with a dedicated queue that is configured to use a given fraction of the cluster capacity. Queues may be further divided in hierarchical fashion, allowing each organization to share its cluster allowance between different groups of users within the organization. Within a queue, applications are scheduled using FIFO scheduling.
If you are using the Capacity Scheduler, specify your queue in spark-submit with --queue queueName. Also try changing the Capacity Scheduler property yarn.scheduler.capacity.maximum-applications; it determines how many applications can be running or pending in parallel. A configuration sketch is shown below.
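A minimal configuration sketch, assuming a Capacity Scheduler setup; the queue name queueName, the example class/jar names, and the numeric value are placeholders to adjust for your cluster:

<!-- yarn-site.xml: select the scheduler implementation (pick one value) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <!-- or: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler -->
</property>

<!-- capacity-scheduler.xml: cap on concurrently active (running + pending) applications -->
<property>
  <name>yarn.scheduler.capacity.maximum-applications</name>
  <value>10000</value>
</property>

# spark-submit: target a specific queue
spark-submit --master yarn --queue queueName --class com.example.MyApp my-app.jar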
By default, Spark will acquire all available resources when it launches a job.
You can limit the amount of resources consumed for each job via the spark-submit command.
Add the option "--conf spark.cores.max=1" to spark-submit. You can change the number of cores to suit your environment; for example, if you have 100 total cores, you might limit a single job to 25 cores or 5 cores, etc.
You can also limit the amount of memory consumed: --conf spark.executor.memory=4g
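For example, here is a hedged sketch of a single spark-submit invocation combining both limits; the class name, jar, and values are placeholders:

# Cap the total cores and the per-executor memory for one job (example values)
spark-submit \
  --conf spark.cores.max=25 \
  --conf spark.executor.memory=4g \
  --class com.example.MyApp \
  my-app.jar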
You can change settings via spark-submit or in the file conf/spark-defaults.conf. Here is a link with documentation:
Spark Configuration

Increase RAM usage for IIS server

I am running a large scale ERP system on the following server configuration. The application is developed using AngularJS and ASP.NET 4.5
Hardware: Dell PowerEdge R730 (quad-core 2.7 GHz, 32 GB RAM, 5 x 500 GB hard disks, RAID 5 configured). Software: the host OS is VMware ESXi 6.0, and two VMs run on it. One is Windows Server 2012 R2 with 16 GB of memory allocated; this contains the IIS 8 server with my application code. The other VM is also Windows Server 2012 R2, with SQL Server 2012 and 16 GB of memory allocated; this just contains my application database.
You see, I separated the application server and database server for load balancing purposes.
My application contains a registration module where the load is expected to be very very high (around 10,000 visitors over 10 minutes)
To support this volume of requests, I have done the following on my IIS server:
increased the request queue length of each application pool to 5000
enabled output caching for aspx files
enabled static and dynamic compression on the IIS server
set the virtual memory limit and private memory limit of each application pool to 0
increased the maximum worker processes of each application pool to 6
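For reference, roughly the same pool settings can also be applied from the command line with appcmd; this is only a sketch, assuming an application pool named "MyAppPool" (adjust the name and values to your setup):

rem Request queue length for the pool
appcmd set apppool "MyAppPool" /queueLength:5000
rem Disable the virtual and private memory recycling limits (0 = no limit)
appcmd set apppool "MyAppPool" /recycling.periodicRestart.memory:0
appcmd set apppool "MyAppPool" /recycling.periodicRestart.privateMemory:0
rem Allow up to 6 worker processes (web garden)
appcmd set apppool "MyAppPool" /processModel.maxProcesses:6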
I then used Gatling to run load tests against my application, injecting 500 users at once into the registration module.
However, I see that only 40-45% of my RAM is being used, and each worker process uses a maximum of only about 130 MB.
Gatling reports that around 20% of my requests get a 403 error, and more than 60% of all HTTP requests have a response time greater than 20 seconds.
A single user makes 380 HTTP requests over a span of around 3 minutes. The total data transfer of a single user is 1.5 MB. I have simulated 500 users like this.
Is there anything missing in my server tuning? I have already tuned my application code to minimize memory leaks, increase timeouts, and so on.
There is a known issue with the newest generation of PowerEdge servers that use the Broadcom network chipset. Apparently, the "VM" feature for the network is broken, which results in horrible network latency on VMs.
Head to Dell and get the most recent firmware and Windows drivers for the Broadcom chipset.
Head to VMware Downloads and get the latest Broadcom driver.
As for the worker process settings, for maximum performance, you should consider running the same number of worker processes as there are NUMA nodes, so that there is 1:1 affinity between the worker processes and NUMA nodes. This can be done by setting "Maximum Worker Processes" AppPool setting to 0. In this setting, IIS determines how many NUMA nodes are available on the hardware and starts the same number of worker processes.
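As a sketch, that setting can be applied with appcmd as well (the pool name is a placeholder):

rem 0 = let IIS start one worker process per NUMA node
appcmd set apppool "MyAppPool" /processModel.maxProcesses:0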
I guess the one caveat to the answer you received is that if your server isn't NUMA aware / uses symmetric processing, you won't see those IIS options under CPU, but the above poster seems to know a good bit more than I do about the machine. Sorry, I don't have enough street cred to add this as a comment. As far as IIS goes, you may also want to make sure your app pool doesn't use the default recycle conditions, and pick a time like midnight for the recycle. If you have root-level settings applied, the default app pool recycling at 29 hours may also trigger garbage collection against your child pool, causing delays even with concurrent GC; it sounds like you may benefit a bit from gcServer=true, though that is pretty tough to assess.
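A sketch of those two tweaks; the pool name is a placeholder, and the gcServer element is assumed to go into the runtime configuration that applies to the worker process (for IIS-hosted ASP.NET this is typically Aspnet.config in the .NET Framework directory):

rem Turn off the default 29-hour recycle and schedule a recycle at midnight instead
appcmd set apppool "MyAppPool" /recycling.periodicRestart.time:00:00:00
appcmd set apppool "MyAppPool" /+recycling.periodicRestart.schedule.[value='00:00:00']

<!-- Aspnet.config: opt the worker processes into server GC -->
<configuration>
  <runtime>
    <gcServer enabled="true" />
  </runtime>
</configuration>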
Has your SQL Server been optimized for that type of workload? If your data isn't paramount, you could squeeze out faster execution times with delayed durability, then assess queries that return too much data for async IO wait types. In general, there's not enough here to really assess SQL optimizations, but if the database isn't configured right (size/growth options), you could be hitting a lot of timeouts due to file growth, VLF fragmentation, etc.

OpenStack: how to decide hardware capacity?

I've been reading some OpenStack material recently, but haven't had a chance to try it yet. I get the sense that OpenStack can manage a large number of virtual machines via an API or the dashboard interface, and that users can easily create/start virtual machines.
This leaves me with a question. Since the underlying hardware varies, some computers may only be able to host one virtual machine, while others can host ten. When a user starts a virtual machine, does the user manually designate a physical computer to host it, or does OpenStack pick one automatically? In either case, how is the physical computer's capacity decided? Does OpenStack provide functionality to set a capacity attribute on a physical computer?
When you run OpenStack, each physical machine (which OpenStack calls compute hosts) will periodically report how many CPUs it has and how much RAM it has, as well as how many CPUs and how much RAM have been allocated to virtual machines that are currently running.
The OpenStack scheduler uses this information to determine which compute host to run a VM on. First, it checks to see if a host has enough CPUs (by applying the CoreFilter) and enough RAM (by applying the RamFilter). Compute hosts that don't have enough CPUs or RAM available won't even be considered.
Once it has a set of candidate hosts with enough CPU and RAM, the scheduler needs to pick one of them. By default, the scheduler uses a "spread-first" strategy, allocating VMs to the machines that have the most CPU/RAM not currently allocated to VMs. It's possible to change this strategy to a "fill-first" behavior, so that the compute host with the least amount of free resources gets allocated first. This is configured by setting the nova.scheduler.least_cost.compute_fill_first_cost_fn parameter.
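As a rough sketch of the relevant nova.conf settings; option names changed across OpenStack releases, so treat these as illustrative of the release era this answer describes rather than definitive:

# nova.conf (scheduler settings; names vary by release)
scheduler_default_filters = RamFilter,CoreFilter,ComputeFilter
# Overcommit ratios the filters use when computing a host's effective capacity
cpu_allocation_ratio = 16.0
ram_allocation_ratio = 1.5
# Cost function for host selection; the default negative weight spreads VMs,
# a positive weight switches to fill-first
least_cost_functions = nova.scheduler.least_cost.compute_fill_first_cost_fn
compute_fill_first_cost_fn_weight = -1.0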
For more information, see the chapter on scheduling in the OpenStack Compute Admin guide.

NonStop ODBC: how are connections (ODBC servers) assigned to CPUs?

We have an ODBC pool running on a NonStop server. The pool is connected to SQL/MX.
This pool is used by a few external Java applications, each of which has a JDBC pool connected to the ODBC pool (e.g. 14 connections per application).
With time (after a few application recycles) we see an imbalance between CPUs -- some have 8 ODBC processes running, some only 5. That leads to CPU time imbalance too.
Up to this point we assumed that CPUs are assigned to ODBC processes in round-robin fashion, which would keep the number of ODBC processes more or less evenly distributed. That is not the case, though.
Is there any information on how the ODBC pool decides which CPU to choose for every newly allocated process? Does it look at CPU load? Available memory? Something else?
Sadly, even HP's own people (available to us, that is) couldn't answer those questions with certainty. :-(
In fact, connections are assigned to CPUs in round-robin fashion. But if one of the consumers (with its own pool) is restarted for any reason, its connections are released on the CPUs where they were allocated (obviously), while the new ones are allocated starting from the next CPU in the round-robin order. Thus some CPUs become less busy and some busier, hence the imbalance.
