Timeout waiting for response from storage host - apache-cloudstack

We are evaluating moving from ESXi standalone hosts to CloudStack. I used the documentation at http://cloudstack-installation.readthedocs.org/en/latest/qig.html to do a single-machine setup on a fresh instance of CentOS 6.6.
Installation is complete; however, I cannot add any ISO templates. When I add any version of CentOS 6 (I tried 6.5 and 6.6 minimal), I get the cryptic "Timeout waiting for response from storage host." I tested the two NFS shares that are set up during the install, and both can be mounted with no issue.
Are there logs that might give me more information about what is going on? Going through /var/log/cloudstack, all I'm getting is a pile of Tomcat and Java logs with nothing obviously referencing what I'm trying to do.

I found the problem. Apparently the firewall configuration in their guide is just wrong (I was iffy on it originally but decided to stick with it). Stopping iptables allows the image to be uploaded. I assume restarting the services will re-add the firewall rules we DO need.


Cosmos DB Emulator won't start on Windows 10

I’m receiving this error when I try to start Cosmos Emulator 2.7.2.0:
I tried it on a fresh Windows 10 machine, and it works there. Any ideas on what might cause this?
I tried all the suggestions in this article.
Looking at the etl file didn't give me anything either.
It's the first time I've tried to install it on my machine.
-- UPDATE --
I opened the dmp file, and found this:
Loading Dump File [C:\Users\xxx\AppData\Local\CrashDumps\Microsoft.Azure.Cosmos.StartupEntryPoint.exe(1).12596.dmp]
User Mini Dump File: Only registers, stack and portions of memory are available
Symbol search path is: srv*
Executable search path is:
Windows 10 Version 18363 MP (12 procs) Free x64
Product: WinNt, suite: SingleUserTS
18362.1.amd64fre.19h1_release.190318-1202
Machine Name:
Debug session time: Mon Jan 20 09:35:26.000 2020 (UTC + 1:00)
System Uptime: not available
Process Uptime: 0 days 0:00:02.000
................................................................
..................................
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(3134.41b0): Unknown exception - code c000000d (first/second chance not available)
For analysis of this file, run !analyze -v
ntdll!NtWaitForMultipleObjects+0x14:
00007ff9`562fcc14 c3 ret
0:000> .ecxr
rax=0000000000000000 rbx=0000000000000000 rcx=0000003629dfd270
rdx=00000000e6bc7d4e rsi=0000000000000000 rdi=0000000000000000
rip=000000006748b0ec rsp=0000003629dfd190 rbp=0000000000000000
r8=000001ba53134730 r9=0000000000000000 r10=0000000000000026
r11=000001ba6d7662e0 r12=0000000000000004 r13=000001ba6d7974d0
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
msvcr80!_invalid_parameter+0x6c:
00000000`6748b0ec 488d4c2440 lea rcx,[rsp+40h]
It seems like something is wrong with my version of msvcr80.
This may be caused by corrupted performance counters on your machine.
To fix your performance counters try the following.
Open cmd as administrator
Run "lodctr /R" (must use capital R)
If this doesn't work, see this link here that shows other options to reset your counters. Older article but works the same on Windows 10.
https://answers.microsoft.com/en-us/insider/forum/insider_wintp-insider_perf/possible-performance-counter-problem-on-win10/3a5c22cb-1425-4d26-99e7-4ec46940b9a1
By the way, one more option is to run the emulator in a Docker container. The docs for that are in the article you referenced in your question.
Hope this is helpful.
For me this confusing error message simply meant that I needed to use another port! Or fix problems with the default 8081 one (read update 2 way below).
Here's the text of the error so that it's searchable:
Error: multiple attempts to restart one of the Azure Cosmos Emulator
processes were detected. The emulator will shutdown. If the problem
persist please try uninstalling the Azure Cosmos Emulator, remove the
CosmosDBEmulator directory from your %%LOCALAPPDATA%% and reinstall.
You can also reach out to the Azure Cosmos team using the 'Feedback'
link in the Data Explorer browser window.
Details:
In my case this happened by itself after Windows 10 updated to version 2004. I swear something breaks in Cosmos DB emulator every feature update!
For me there were no crash dumps, and rebuilding performance counters didn't help. In the Event Viewer's System log I noticed this error A LOT:
Unable to bind to the underlying transport for 127.0.0.1:8081. The IP Listen-Only list may contain a reference to an interface which may not exist on this machine. The data field contains the error number.
No solutions for this error helped, so I first used netstat -ao to see if any program is already listening at 8081. Nope.
Then I used a "Port listener" utility from the web, and it turned out it couldn't listen on 8081 either, not even with admin rights!
Since I couldn't fix this 8081 problem at all & turning off Windows Firewall didn't help, I simply started the Cosmos DB emulator at a different port and it worked!
C:\Program Files\Azure Cosmos DB Emulator>Microsoft.Azure.Cosmos.Emulator.exe /Port=18001
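The "can anything listen on this port?" test described above can also be scripted instead of relying on a separate utility. The following is a minimal Python sketch, not part of the emulator tooling; the helper name can_bind is hypothetical, and it only checks TCP on the loopback address.

```python
import socket


def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Raised when the port is in use, reserved, or access is denied.
            return False
        return True


if __name__ == "__main__":
    # 8081 is the emulator's default port; 18001 is the alternative used above.
    for port in (8081, 18001):
        state = "bindable" if can_bind(port) else "NOT bindable"
        print(f"127.0.0.1:{port} is {state}")
```

If 8081 is reported as not bindable while netstat shows no listener, the port is blocked by something other than an ordinary process, which matches the situation described here.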
If you've configured cosmos db to be started automatically on system startup, make sure to update that script as well.
In my case this cmdline got me into the Data explorer, but it wasn't actually working. The "Explorer" part of it was "fetching offers" infinitely. I had to again uninstall Cosmos DB emulator, clean this folder: %LOCALAPPDATA%\CosmosDBEmulator, reboot, install again without starting at 8081, and only start at the new port.
Cosmos DB devs, you're awesome people, but this is too much hassle. And if you can't use the port just say so, you don't have to be cryptic!
UPDATE:
This worked, but I forgot to update the start up script to use the new port, and noticed that after one more reboot the cosmos db started on 8081 without issues. What??! But it wasn't just any old reboot - I already did several of them before and they didn't help. This reboot may have been special in that it came after Windows found one more update (kb4576478), probably realizing it's now needed since I'm at version 2004, installed it, and THAT reboot (or the kb4576478 update specifically) fixed the 8081 problem, whatever it was. Yikes!
UPDATE 2:
Had another port-related problem with another tool. Ugh. Found this solution (see "General workaround") https://superuser.com/a/1568476/251491 This page's accepted answer also mentions the magic of multiple reboots.
Finally got it working!
By using WinDbg and opening the dmp files in %LOCALAPPDATA%\CrashDumps, I found out that a dll called perf-MSSQL$SQLEXPRESS-sqlctr10.1.2531.0.dll was the root cause.
After deleting it from %WINDIR%\System32, the error was "fixed" and I can now run the emulator.

Symfony / Sylius site on Vagrant / Puphpet is slow. Same site not on a Virtualbox is not slow

We have one particular site that is Symfony and uses the e-commerce bundle Sylius.
Our developers are trying to use Vagrant so we can have similar dev environments. We use Puphpet to generate the Vagrant instance and share the config file.
If we are working on the site/repo natively or on a staging server, all runs fine. Pages load in around 2-3 seconds.
When we are using Vagrant / Virtualbox, it's 30-35 seconds per page load.
So far we've tried
Allocating up to 6GB to the box
Giving up to 4 processors to the box
Turning on NFS for file sync
Turning off all other programs on computers running Vagrant / Virtualbox (chat, other browsers, etc)
None of those things made an impact on page load time.
I can provide 2 things. One is the load trace from Symfony: https://nimbus.everhelper.me/client/notes/share/708707/mvw707mckzm2wq4rlkzc
Since there is so much code to the puphpet config, I put it in a pastebin here: http://pastebin.com/7ciVA5FL
What is the OS on the host machine?
My guess would be that the file system is slow. Try running the app outside of the shared folder on the guest machine. If it is fast there, then at least you will have located the problem.
NFS on *nix or Mac should be fast enough. Are you sure you succeeded in turning it on?
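One way to test the suspicion that the shared folder is the bottleneck is to time many small file operations in the synced directory and in a guest-local directory, run from inside the guest. This is a minimal Python sketch under assumptions: the paths /vagrant and /tmp are placeholders for your setup, and the helper name is hypothetical. Small files are used because a Symfony request typically touches many of them.

```python
import os
import tempfile
import time


def time_small_file_io(directory, count=200):
    """Create, read back and delete `count` small files; return elapsed seconds."""
    start = time.perf_counter()
    # The temporary directory (and its files) is removed when the block exits.
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        for i in range(count):
            path = os.path.join(workdir, f"probe_{i}.txt")
            with open(path, "w") as fh:
                fh.write("x" * 1024)
            with open(path) as fh:
                fh.read()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder paths: the synced folder and a directory local to the guest.
    for label, directory in (("shared", "/vagrant"), ("local", "/tmp")):
        if os.path.isdir(directory):
            elapsed = time_small_file_io(directory)
            print(f"{label:>6} {directory}: {elapsed:.3f}s")
```

A large ratio between the two timings would point at file sharing rather than at CPU or RAM, which would be consistent with the extra memory and processors making no difference.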
I had this pain once, and finally started to use unison instead of native vagrant's file sharing system (https://www.cis.upenn.edu/~bcpierce/unison/)
Have you tried:
http://www.whitewashing.de/2013/08/19/speedup_symfony2_on_vagrant_boxes.html
or http://jeremybarthe.com/2015/02/02/speed-up-vagrant-environment-symfony2/
I think the first one is already included in Sylius, but not sure.
Also, dynamic image resize/crop may be reading/writing in the host file system and maybe there's a way to also change that (using symlinks or similar)?
vagrant-winnfsd works fine for me for getting NFS to work on Windows.

MySQL keeps running out of memory with Wordpress, how much memory do I need?

I have been experiencing MySQL crashing recently and really need to figure out what I need to do to get this to stop.
I have a 2GB Digital Ocean server running the following:
Ubuntu 14.04
PHP v5.5.9
Apache v20120211
MySQL v5.5.43
Wordpress v4.2
I also have 2GB of swap.
The last time MySQL crashed this was in my error log
http://laravel.io/bin/E304E
The important part seems (to me) to be this
InnoDB: Fatal error: cannot allocate memory for the buffer pool
I am getting about 2000 page views per day. I thought this should be easily enough memory to run the site.
Can anyone give me some ideas what I can do or what I definitely need to do to stop this happening?
Thanks
2000 page views per day is well within the range of what your server can handle. It's possible you're getting hit by bots and/or Apache isn't configured well for your server size.
Apache2Buddy is a quick diagnostic tool to help with your Apache configuration. Run it with $ curl -L http://apache2buddy.pl/ | perl and it'll print out a report with suggested configuration adjustments given your available RAM and application size. My guess is that you'll need to lower MaxRequestWorkers (located at /etc/apache2/mods-available/mpm_prefork.conf).
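As a rough illustration of the arithmetic behind such a recommendation: the number of prefork workers that fit in memory is roughly the RAM left after other services, divided by the size of one Apache process. The sketch below is a back-of-the-envelope estimate only, not Apache2Buddy's actual algorithm, and the figures (768 MB reserved for MySQL and the OS, 60 MB per Apache process) are hypothetical rather than measured on this server.

```python
def estimate_max_request_workers(total_ram_mb, reserved_mb, apache_proc_mb):
    """Rough upper bound on Apache prefork workers that fit in RAM."""
    if apache_proc_mb <= 0:
        raise ValueError("apache_proc_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(usable_mb // apache_proc_mb, 0)


if __name__ == "__main__":
    # Hypothetical numbers for a 2 GB droplet.
    workers = estimate_max_request_workers(
        total_ram_mb=2048, reserved_mb=768, apache_proc_mb=60
    )
    print("MaxRequestWorkers should be at most about", workers)
```

With these assumed numbers the estimate is 21 workers. If the configured value is far above the estimate, a traffic burst can push Apache into swap and leave MySQL unable to allocate its buffer pool.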
I'm also guessing that you have bots hitting your site, which is causing the huge volume of traffic that is crashing Apache. Check your access logs with $ cat /var/log/apache2/access.log.
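Reading the raw access log by eye gets tedious, so a small script that counts requests per client can make bot traffic stand out. This is a minimal Python sketch, assuming the log uses Apache's "combined" format (which includes the user agent); the function name top_clients is hypothetical.

```python
import re
import sys
from collections import Counter

# Matches the client IP and the user agent of a "combined" format log line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) .*?"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def top_clients(lines, n=10):
    """Return the n most frequent client IPs and user agents."""
    ips, agents = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ips[match.group("ip")] += 1
            agents[match.group("agent")] += 1
    return ips.most_common(n), agents.most_common(n)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python top_clients.py /var/log/apache2/access.log
    with open(sys.argv[1], errors="replace") as fh:
        top_ips, top_agents = top_clients(fh)
    for ip, hits in top_ips:
        print(f"{hits:8d}  {ip}")
    for agent, hits in top_agents:
        print(f"{hits:8d}  {agent}")
```

A handful of addresses or user agents accounting for most of the requests would support the bot theory.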
I wrote an article on this situation if you want a deeper explanation, a method to stress test, or ideas on how to block some of the bot traffic: http://brunzino.github.io/blog/2016/05/21/solution-how-to-debug-intermittent-error-establishing-database-connection/

java.net.BindException: No free port within range in Glassfish 3.1

Today I deployed an app to our production application server (GlassfishV3) through Jenkins CI, using the autodeploy folder. The app server went down, and I cannot bring it back up.
My goal is to have the server up and running the same as prior to deploying the application. This is what I have done:
First find the PID of the process running at port 4848: netstat -nlept
Then kill the PID by doing kill -9 PID
Remove the war file that Jenkins just put in the autodeploy directory, in case that is the problem.
Start the server again by doing ./asadmin start-domain domain1
The server takes FOREVER to start !!! In fact it never starts successfully as I cannot access the admin console at 4848 or any of the other apps that were already running. However, it leaves a process running at 4848.
I looked at the jvm.log and server.log and I found a java.net.BindException:No free port within range.........
So my questions are as follows:
Do you know what is going on?
Do you know how to fix it?
Do you know of a way to speed up the ./asadmin start-domain domain1 process?
Note: In our QA app server (Same version, same OS, Same Java, Same Grails) it does not happen. Really frustrated with this issue.
Thanks a lot for your help. Any help would be very much appreciated as this is a production issue that has several applications down for a few hours already.
Dario
I can deploy my application now. Basically, it boiled down to increasing the MaxPermSize JVM option.
Under the config folder, edit domain.xml and change the default size to this:
-XX:MaxPermSize=256m
You can always increase it as necessary.
Also, if that is not enough, you can change the max heap size in that same file:
-Xmx512m
I have left it as is, but if required you can change it to 6g or more on a 64-bit OS. On a 32-bit OS it will only recognize up to 3.5g.
Hope this helps somebody else in the future, as this issue kept me at work until 9:00PM
UPDATE:
I had performance issues again and I found this other solution in Joshi's tech blog:
http://joshitech.blogspot.com/2009/09/glassfish-application-server.html
Basically, add the following jvm options to the domain.xml. They should improve Glassfish boot-up and deployment performance:
<jvm-options>-server</jvm-options>
<jvm-options>-Xms3000m</jvm-options>
<jvm-options>-Xmx3000m</jvm-options>
<jvm-options>-XX:MaxPermSize=192m</jvm-options>
<jvm-options>-XX:NewRatio=2</jvm-options>
<jvm-options>-XX:+AggressiveHeap</jvm-options>
<jvm-options>-XX:+AggressiveOpts</jvm-options>
<jvm-options>-XX:+UseParallelGC</jvm-options>
<jvm-options>-XX:+UseParallelOldGC</jvm-options>
<jvm-options>-XX:ParallelGCThreads=5</jvm-options>

Glassfish admin console slow loading

Today I stopped/started my GlassfishV3 instance and now I cannot access the admin console located at http://servername:4848/. The screen says: "The admin console is loading..." and stays that way forever.
I have tried as follows:
I have tried adding the following entry to my domain.xml located at /glassfishv3/glassfish/domains/domain1/config as suggested in another Stack Overflow Q&A but after restarting the server still no luck.
<java-options>-Dcom.sun.enterprise.tools.admingui.NO_NETWORK=true</java-options>
I have also installed glassfishv3 on my local machine and cannot recreate the problem. I can go to http://localhost:4848 without any problem.
I have also looked at the server.log and jvm.log files located under /glassfishv3/glassfish/domains/domain1/logs and found nothing there that sheds any light.
Any help would be very much appreciated.
I had similar symptoms, and I tried some of what Dario had suggested as well, but it didn't work. It could be that I had a unique configuration for my dev env: I'm running Glassfish 3.1 on a VirtualBox Ubuntu 11.04 64-bit guest on a Windows 7 64-bit host. Quite by accident, I discovered an additional symptom: if I turned off the network on the Ubuntu guest, the console would load successfully in a localhost browser instance. That is, on the Ubuntu guest with the network off, I could successfully navigate to http://localhost:4848 and see the Glassfish admin console as expected. However, if the Ubuntu guest's network was on, I had the exact behavior described by the original poster: http://localhost:4848 would just sit forever on the initial loading page.
To make a long story short, I found that adding the following argument to the JVM options for server-config fixed the problem:
-Djava.net.preferIPv4Stack=true
When I made that change and restarted the Glassfish server, everything worked.
(Note that I also had in place some of the other settings recommended above, i.e., NO_NETWORK=true, and I'd adjusted the JVM memory footprint and set it to -server instead of -client. It could be that these settings are required as well, though they weren't sufficient on their own in my case.)
I was having this exact same problem. I could deploy in run mode, but it would hang forever in Debug mode. IntelliJ was hanging on the breakpoints. I muted the breakpoints, and glassfish3 worked good as new. I didn't have to change any domain.xml settings. Check your breakpoints!
I found a solution to my problem. Setting the java-option to NO_NETWORK to true did not work so I upgraded from 3.0.1 to 3.1 and it got fixed. Not immediately though, I had to stop/start the Glassfish server a couple of times before I got into the admin console without any really long delays.
Solution
The solution was to upgrade from the command line using the pkg utility.
You can find the steps in this link:
http://download.oracle.com/docs/cd/E18930_01/html/821-2437/gkthu.html#gktjf
Or do as follows:
Go to as-install-parent/bin
./pkg image-update
as-install-parent/glassfish/bin/asadmin start-domain --upgrade domain-name
as-install-parent/glassfish/bin/asadmin start-domain domain-name
UPDATE
I had performance issues again and I found this other solution in Joshi's tech blog:
http://joshitech.blogspot.com/2009/09/glassfish-application-server.html
Basically, add the following jvm options to the domain.xml. They should improve Glassfish boot-up and deployment performance:
<jvm-options>-server</jvm-options>
<jvm-options>-Xms3000m</jvm-options>
<jvm-options>-Xmx3000m</jvm-options>
<jvm-options>-XX:MaxPermSize=192m</jvm-options>
<jvm-options>-XX:NewRatio=2</jvm-options>
<jvm-options>-XX:+AggressiveHeap</jvm-options>
<jvm-options>-XX:+AggressiveOpts</jvm-options>
<jvm-options>-XX:+UseParallelGC</jvm-options>
<jvm-options>-XX:+UseParallelOldGC</jvm-options>
<jvm-options>-XX:ParallelGCThreads=5</jvm-options>
I don't know if you are referencing this answer, but there is a second step described (disabling update module).
Two more ideas:
Check if the NO_NETWORK=true option really works (there should be no ads in the GF admin console).
Watch the server.log (glassfish-install-dir/glassfish/domains/domain1/logs) during startup and look for the last log entry before the delay occurs. This could be a hint for the source of the delay.
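Finding "the last log entry before the delay" can be automated by looking for the largest gap between consecutive timestamps. This is a minimal Python sketch, assuming the default GlassFish 3 log format in which each record starts with [#|timestamp|, for example [#|2011-05-20T10:15:30.123-0400|INFO|...; the function name largest_gap is hypothetical.

```python
import re
from datetime import datetime

# Timestamp at the start of a record, e.g. [#|2011-05-20T10:15:30.123-0400|
TIMESTAMP = re.compile(
    r"^\[#\|(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4})\|"
)


def largest_gap(lines):
    """Return (gap_in_seconds, record_line_before_gap) for the longest pause."""
    previous_time = None
    previous_line = None
    best = (0.0, None)
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # continuation line of a multi-line record
        current = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
        if previous_time is not None:
            gap = (current - previous_time).total_seconds()
            if gap > best[0]:
                best = (gap, previous_line)
        previous_time, previous_line = current, line.rstrip("\n")
    return best
```

Calling largest_gap(open("server.log")) after a slow startup returns the length of the longest pause and the record logged just before it, which is the entry worth investigating.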
Beware of blindly following Dario's example unless you've lots more RAM than most do.
-Xms3000m gives 3 GB to Glassfish. Do YOU have that much spare RAM?
I tried this on my 4 GB Mac with 1 GB for Glassfish. It made no discernible difference at all... performance still sux.
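To make the warning concrete, the heap and PermGen sizes in a list of JVM options can be added up and compared with the machine's RAM. This is a minimal Python sketch using the options quoted in this thread; the function names are hypothetical, and the sum is only a lower bound, since the JVM also needs memory for thread stacks and native allocations.

```python
import re

SIZE = re.compile(r"^(\d+)([kKmMgG]?)$")
UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a JVM size such as '3000m' or '1g' to bytes."""
    match = SIZE.match(text)
    if not match:
        raise ValueError(f"unrecognized JVM size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2).lower()]


def configured_jvm_memory(options):
    """Sum the largest heap setting (-Xms/-Xmx) and -XX:MaxPermSize, in bytes."""
    heap = perm = 0
    for opt in options:
        if opt.startswith(("-Xmx", "-Xms")):
            heap = max(heap, parse_size(opt[4:]))
        elif opt.startswith("-XX:MaxPermSize="):
            perm = parse_size(opt.split("=", 1)[1])
    return heap + perm


if __name__ == "__main__":
    options = ["-server", "-Xms3000m", "-Xmx3000m", "-XX:MaxPermSize=192m"]
    needed_mib = configured_jvm_memory(options) // 1024 ** 2
    ram_mib = 4096  # assumed: a 4 GB machine, as in the answer above
    print(f"JVM is configured for at least {needed_mib} MiB of {ram_mib} MiB RAM")
```

For the quoted options this comes to 3192 MiB, which leaves under 1 GiB for everything else on a 4 GB machine.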
