CloudStack: No suitable hosts found under this Cluster

When I try to start an instance from a template, I get the following error messages:
2013-11-10 19:44:28,716 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-5:job-19 = [ d070b5ba-f342-4252-9137-4d2c1b19eca6 ]) No suitable hosts found under this Cluster: 2
2013-11-10 19:44:28,718 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] (Job-Executor-5:job-19 = [ d070b5ba-f342-4252-9137-4d2c1b19eca6 ]) Could not find suitable Deployment Destination for this VM under any clusters, returning.
2013-11-10 19:44:28,718 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-5:job-19 = [ d070b5ba-f342-4252-9137-4d2c1b19eca6 ]) Searching all possible resources under this Zone: 1
2013-11-10 19:44:28,718 DEBUG [cloud.deploy.FirstFitPlanner] (Job-Executor-5:job-19 = [ d070b5ba-f342-4252-9137-4d2c1b19eca6 ]) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Zone: 1
I am confused because I already have a host in cluster 2.
Can anyone give me some suggestions? Any reply will be appreciated!

You need to take a closer look at the log file to understand why CloudStack is unable to place a VM on the host. The relevant information will appear above the entries you have provided. Many different issues can cause this problem.
For example, this blog entry walks through a configuration problem with XenServer.
Another common issue arises with local storage: you need to create a new compute offering that uses local storage disks, because the default compute offerings do not support local storage.
Updated: changed answer to take into account that your sample is from the log file.

Most probably you are running out of capacity, or, because of some other error, CloudStack has added your host to the "avoid list". It does that when the management server hits an error while deploying an instance; from then on, until the problem is resolved, the host and its cluster stay in the avoid list and are skipped in subsequent deployments.
You need to find the exact reason by monitoring the management server logs. Log in to your management server and go to the folder /var/log/cloudstack/management/.
Now run the command tail -f management-server.log
This gives you continuous output of the management server log, so you can see exactly what is happening at the moment.
Now perform the operation in the UI (e.g. try to add an instance) and watch the running log.
Abort the command when you see an exception in the log, and read the log statements just above the exception.
Also, as standard practice, make a habit of monitoring the management server logs and the agent logs (on the host: /var/log/cloudstack/agent/agent.log).
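If watching the live log by eye is too fast to follow, the "read the lines just above the exception" step can also be done after the fact on a saved copy of the log. The following is only a rough sketch, not a CloudStack tool: the function names, the default marker string "Exception" and the 20-line window are assumptions chosen for illustration.

```python
from collections import deque


def lines_before_marker(log_lines, marker="Exception", context=20):
    """Return the first line containing `marker`, preceded by up to
    `context` lines that came just before it (empty list if no match)."""
    recent = deque(maxlen=context)  # rolling window of the latest lines
    for line in log_lines:
        if marker in line:
            return list(recent) + [line]
        recent.append(line)
    return []


def read_lines(path):
    """Read a log file into a list of lines without trailing newlines."""
    with open(path, errors="replace") as handle:
        return handle.read().splitlines()
```

For example, lines_before_marker(read_lines("/var/log/cloudstack/management/management-server.log")) would return the first exception line together with the entries leading up to it, which is usually where the planner explains why a host was rejected.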

Related

Some BizTalk Receive Locations are disabled after server reboot

Some BizTalk Receive Locations are found disabled after a server reboot (BizTalk Server and SQL Server are installed on separate physical servers).
Any ideas about this scenario? Could it be due to the boot sequence or some other issue?
I will assume that, once you enable the receive locations manually, they are working correctly.
Typically, when FILE receive locations fail while pointing to an external server/share, it is because they are no longer available.
Make sure that, during the night, there are no network issues, planned/unplanned downtime of the share (= here your SQL server). A BizTalk receive location will retry to access a share for quite a while before disabling itself. Check the event log(s) for more information. You would want to look for errors/warnings there indicating an issue with connectivity between BizTalk and SQL.
Another issue might be that there are too many connections between your BizTalk server and SQL server. You can provide a maximum number of connections in the FILE share properties.
Also, you could try this link: https://serverfault.com/questions/235032/intermittent-connection-to-windows-7-shared-folder-from-windows-xp-workstations
It describes a potential fix for optimizing throughput for file sharing, although it depends on your operating system.

BYON with physical machines, SLA is global: how to ensure that applications are not all installed on the same machine

I have a scenario like this:
I have applications A, B, C, D, ... and physical machines M, N, O, P, Q, ...
I use BYON to manage the physical machines. Because each physical machine is powerful, I want to deploy several applications on it, so I set the SLA to global. My question: once application A is deployed on machine M and I then deploy applications B, C, D, ..., will they all be installed on machine M only, rather than on machines N, O, P, Q, ...? (In that case, the load on machine M would be very high.)
Does this problem exist and, if so, how can I resolve it? Thank you very much!
It's possible to limit the number of services on a specific machine by specifying the memory required for each service. As part of the global isolation SLA, you can set the amount of memory required by each service instance, so that when there isn't enough memory left on a machine, the next machine is used.
The syntax is:
isolationSLA {
    global {
        instanceCpuCores 0
        instanceMemoryMB 128 // each instance needs 128MB allocated for it on the VM.
        useManagement true // Enables installing services on the management server. Defaults to false.
    }
}
Please note that the above code also allows services to be installed on the management machine itself; set useManagement to false to prevent that.
A more detailed explanation is available here, under "Isolation SLA".
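To see why a per-instance memory requirement spreads services across machines, here is a small illustrative calculation. It is not the actual placement algorithm used by the tool. It only assumes a simple first-fit rule, and the machine names and memory sizes are hypothetical.

```python
def place_instances(machine_memory_mb, instance_memory_mb, instance_count):
    """First-fit placement: put each instance on the first machine that
    still has `instance_memory_mb` free; None means nothing had room."""
    free = dict(machine_memory_mb)  # copy so the caller's dict is untouched
    placement = []
    for _ in range(instance_count):
        for name in free:
            if free[name] >= instance_memory_mb:
                free[name] -= instance_memory_mb
                placement.append(name)
                break
        else:
            placement.append(None)  # no machine has enough memory left
    return placement


# Hypothetical machines M and N with 256 MB each, 128 MB per instance.
print(place_instances({"M": 256, "N": 256}, 128, 5))
```

With these made-up numbers, M takes two instances, N takes the next two, and the fifth finds no room. A larger instanceMemoryMB therefore means fewer services share one machine.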

LoadRunner - Monitoring linux counters gives RPC error

The Linux distribution is Red Hat. I'm monitoring Linux counters with the LoadRunner Controller's System Resources Graphs - Unix Resources. Monitoring works properly at first and the graphs are plotted in real time, but after a few minutes errors appear:
Monitor name :UNIX Resources. Internal rpc error (error code:2).
Machine: 31.2.2.63. Hint: Check that RPC on this machine is up and running.
Check that rstat daemon on this machine is up and running
(use rpcinfo utility for this verification).
Details: RPC: RPC call failed.
RPC-TCP: recv()/recvfrom() failed.
RPC-TCP: Timeout reached. (entry point: Factory::CollectData).
[MsgId: MMSG-47197]
I logged on to the Linux server and found that rstatd was still running. After I cleared the measurements in the Controller's Unix Resources and added them again, monitoring started to work again, but after a few minutes the same error occurred.
What might cause this error? Is it due to network traffic?
Consider using SiteScope, which has been the preferred monitoring foundation for collecting UNIX/Linux status since LoadRunner version 8.0. Every LoadRunner license since version 8 has come with a 500-point SiteScope license in the box for this purpose. More points are available upon request for test-exclusive use of the instance.

BizTalk Cluster Servers

We used to have one BizTalk 2006 R2 32-bit server, which we recently upgraded to Enterprise. Because of our traffic volume, a single server did not give us enough processing power and memory, so we also recently installed a second BizTalk server, a 2006 R2 64-bit, and put the two in a shared cluster. Since then a problem has arisen (actually two, but I'm guessing they are connected). One of our 19 host instances keeps going into the "stopped" status. This host instance mainly handles TCP ports. We have a script that checks whether host instances are in the stopped state and starts them again, but it is obviously of little use, since the instance keeps returning to the stopped state. There is also an error in our event viewer, namely:
Faulting application btsntsvc.exe, version 3.6.1404.0, stamp 4674b0a4, faulting module kernel32.dll, version 5.2.3790.4480, stamp 49c51f0a, debug? 0, fault address 0x0000bef7.
Does anyone have any idea?
Thanks
Having automated scripts restart the host instance is not a good idea, in my opinion; you need to get to the bottom of the problem. It looks like a known issue, and a hotfix is available. It is worth looking at this KB article: http://support.microsoft.com/kb/978059

Slow BizTalk File Receive

I have an application with a file receive location. After the host instance has been running for a few hours the receive location fails to identify new files dropped into the folder that it is monitoring. It doesn't forget about them altogether, it's just that performance grinds to a crawl. The receive location is configured to poll the target folder every 60 seconds but after host instance has been running for an hour or so, then it seems that the target folder is being polled only every thirty minutes. If I restart the host instance then the files waiting in the target folder are collected right away and performance is fine for the next hour or so.
The same application runs fine in a different environment.
There are no obvious entries in the event log related to the problem.
All the BizTalk SQL jobs are running fine except for Backup BizTalk Server (BizTalkMgmtDb).
Any suggestions gratefully received.
Thanks
Rob
Here are some additional tools which may help you identify and diagnose BizTalk database issues.
BizTalk MsgBox Viewer
Here is a tool to repair identified errors:
Terminator
Use at your own risk... read the blogs and docs. Start with the MsgBox Viewer and let us know your results.
Without more details, the biggest tell is that your Backup job is failing. If the backup job is failing, it may not be properly configured. If it is properly configured and still failing, then you've got other issues. Can you give us some more information about your BizTalk install?
What version are you running?
What are your database sizes?
What are your purge and archive settings like?
Are there any long-running blocks in your SQL Server DB coming from BizTalk?
Another thing to consider is the user accounts the send, receive and orchestration hosts are running under. Please check the BizTalk Administration Console. If they are all running the same account, sometimes the orchestrations can starve the send and receive processes of CPU time. I believe priority is given to orchestrations then receive, then send. Even if you are just developing, it is useful to use separate accounts for this. This also improves security.
The Wrox BizTalk Server 2006 book also offers tuning advice.
What other things are going on with the server? Is BizTalk pegged otherwise or is it idle?
You mention that the solution does not have any problems in another environment, so it's likely that there is a configuration problem.
Check the following:
** On SQL Server, set some upper memory limit for SQL Server. By default, SQL Server uses whatever it can get and then hangs onto it, so set a reasonable limit so that your system can operate without spending a lot of time paging memory onto and from your hard drive(s).
** Ensure that you have available disk space - maybe you are running low - this can lead to all kinds of strange problems.
** Try to split up the system's paging file among its physical drives (if you have more than one drive on the system). Also consider using a faster drive, or if you have lots of cash lying around, get a SAN.
** In BizTalk, is tracking enabled? If so, are you also tracking message bodies? Disable tracking or message body tracking and see if there is a difference.
** Start Performance Monitor and monitor the following counters when running your solution:
Object: BizTalk Messaging
Instance: (select the receiving host) %%
Counter: Documents Received/Sec
Object: BizTalk Messaging
Instance: (select the transmitting host) %%
Counter: Documents Sent/Sec
Object: XLANG/s Orchestrations
Instance: (select the processing host) %%
Counter: Orchestrations Completed/Sec.
%% You may have only one host, so just use it. Since BizTalk configurations vary, I am using generic names for hosts.
The preceding counters monitor the most basic aspects of your server, but may help to narrow down places to look further. You can, of course, add CPU and Memory too. If you have time (days...maybe weeks) you could monitor for processes that allocate memory and never release it. Use the following counter...
Object: Memory
Counter: Pool Nonpaged Bytes
A slow, steady rise in this counter indicates that a process is not releasing memory, which affects everything on the system.
Let us know how things turn out!
I had the same problem: when my orchestration had been idle for some time, it took a long time to process the first message. An article by EvYoung helped me solve it.
"This is caused by application domain unloading within the BizTalk host process. If an AppDomain is shutdown after idle, the next message that comes needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in low latency requirement scenario, you can modify the BTSNTSVC.EXE.config file and set SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idle."
You can find the article here:
http://blogs.msdn.com/b/biztalkcpr/archive/2008/05/08/thoughts-on-orchestration-performance.aspx
It took me too long to respond, but I thought this might help someone. Cheers :)
Some good suggestions from others. I will add:
Do you have any custom receive pipeline components on the receive location? If so, perhaps one is leaking memory, or calling some external component (e.g. a database) that takes a long time?
How big are the files you are receiving?
On the File transport properties of your receive location, set "file renaming" on. Do the files get renamed within 60 seconds?
