dotnet core - Server hangs on Production - nginx

We are currently experiencing an issue when we run our dotnet core server setup on Production. We publish it in Bamboo and run it from an AWS linux server, and it sits behind an nginx reverse proxy.
Essentially, every few days our dotnet core server process will go mute. It silently accepts and hangs on web requests, and even silently ignores our (more polite) attempts to stop it. We have verified that it is actually the netcore process that hangs by sending curl requests directly to port 5000 from within the server. We've replicated our production deployment to the best of our ability to our test environment and have not been able to reproduce this failure mode.
We've monitored the server with NewRelic and have inspected it at times when it's gone into failure mode. We've not been able to correlate this behaviour with any significant level of traffic, RAM usage, CPU usage, or open file descriptor usage. Indeed, these measurements all seem to stay at very reasonable levels.
My team and I are a bit stuck as to what might be causing our hung server, or even what we can do next to diagnose it. What might be causing our server process to hang? What further steps can we take to diagnose the issue?
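One check that can run unattended while we wait for a repro: probe the app directly (bypassing nginx) and wait for actual HTTP bytes, since a hung Kestrel still accepts TCP connections and a plain connect test would look healthy. A sketch in plain bash, with the port taken from our nginx upstream and an arbitrary deadline:

```shell
#!/usr/bin/env bash
# Probe an HTTP endpoint and report whether it answers within a deadline.
# A hung server still accepts the TCP connection but never sends a
# response, so we must wait for response bytes, not just a connect.
probe_http() {
  local host=$1 port=$2 deadline=${3:-5}
  timeout "$deadline" bash -c "
    exec 3<>/dev/tcp/$host/$port 2>/dev/null || exit 2
    printf 'GET / HTTP/1.0\r\nHost: $host\r\n\r\n' >&3
    head -c 12 <&3 >/dev/null
  " && echo "responded" || echo "no response"
}

# Port 5000 matches the first upstream server in our nginx conf.
probe_http 127.0.0.1 5000
```

Running this from cron and logging the result would at least timestamp the exact moment the process goes mute.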
Extra Information
Our nginx conf template:
upstream wfe {
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
}

server {
    listen 80 default_server;

    location / {
        proxy_set_header Host $http_host;
        proxy_pass http://wfe;
        proxy_read_timeout 20s;

        # Attempting a fix suggested by:
        # https://medium.com/@mshanak/soved-dotnet-core-too-many-open-files-in-system-when-using-postgress-with-entity-framework-c6e30eeff6d1
        proxy_buffering off;

        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection keep-alive;
        proxy_cache_bypass $http_upgrade;
        fastcgi_buffers 16 16k;
        fastcgi_buffer_size 32k;
    }
}
Our Program.cs:
using System.Diagnostics.CodeAnalysis;
using System.IO;
using System.Net;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Logging;
using Serilog;

namespace MyApplication.Presentation
{
    [ExcludeFromCodeCoverage]
    public class Program
    {
        public static void Main(string[] args)
        {
            IWebHost host = WebHost.CreateDefaultBuilder(args)
#if DEBUG
                .UseKestrel(options => options.Listen(IPAddress.Any, 5000))
#endif
                .UseStartup<Startup>()
                .UseSerilog()
                .Build();

            host.Run();
        }
    }
}
During our CD build process, we publish our application for deployment with:
dotnet publish --self-contained -c Release -r linux-x64
We then deploy the folder bin/Release/netcoreapp2.0/linux-x64 to our server, and run publish/<our-executable-name> from within.
EDIT: dotnet --version outputs 2.1.4, both on our CI platform and on the production server.
When the outage starts, nginx logs show that server responses to requests change from 200 to 502, with a single 504 being emitted at the time of the outage.
At the same time, logs from our server process simply stop. There are warnings in them, but they're all explicit warnings that we log from our own application code. None of them indicate that any exceptions have been thrown.
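To pin down exactly when the flip from 200 to 502 happens, bucketing the nginx access log by hour and status code helps. A small sketch, assuming the default combined log format (field 4 is the timestamp, field 9 the status) and a guessed log path:

```shell
# Count nginx responses per hour and status code.
status_by_hour() {
  awk '{
    sub(/^\[/, "", $4)      # strip the leading "[" from the timestamp
    split($4, t, ":")       # t[1] = date, t[2] = hour
    c[t[1] " " t[2] ":00 " $9]++
  } END { for (k in c) print k, c[k] }' "$1" | sort
}

# The log path is an assumption; adjust to your deployment.
[ -f /var/log/nginx/access.log ] && status_by_hour /var/log/nginx/access.log || true
```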

After a few days of investigation I've found the cause of this issue. It is glibc >= 2.27, which can make the GC hang under certain conditions, so there is almost nothing you can do about it in the application itself. However, you have a few options:
Use Alpine Linux. It doesn't rely on glibc.
Use an older distro such as Debian 9, Ubuntu 16.04, or any other with glibc < 2.27.
Try to patch glibc yourself, at your own risk: https://sourceware.org/bugzilla/show_bug.cgi?id=25847
Or wait for the glibc patch to be reviewed by the community and included in your favorite distro.
More information can be found here: https://github.com/dotnet/runtime/issues/47700
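Before applying any workaround, confirm which libc the affected box actually runs, for example:

```shell
# Report the glibc version on this machine; the hang described above is
# tied to glibc >= 2.27, so anything older is not affected by it.
ldd --version | head -n1
getconf GNU_LIBC_VERSION 2>/dev/null || true
```

On musl-based systems (e.g. Alpine) neither command will report a GNU libc version, which is exactly the point of the first option above.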

Related

Accessing GraphDB Workbench through the internet (not localhost) on an Nginx server

I have GraphDB (standalone server version) on Ubuntu Server 16 running on localhost (with the command ./graphdb -d in /etc/graphdb/bin). But I only have ssh access to the server in the terminal, so I can't open the Workbench at localhost:7200 locally.
I have many websites running on this machine with Nginx. If I try to access the machine's main IP on port 7200 from the external web, it doesn't work (e.g. http://193.133.16.72:7200/ = "connection timed out").
I have tried to make a reverse proxy with Nginx with this code ("xxx" = domain):
server {
    listen 7200;
    listen [::]:7200;
    server_name sparql.xxx.com;

    location / {
        proxy_pass http://127.0.0.1:7200;
    }
}
But all this fails. I checked, and port 7200 is open in the firewall (ufw). In the logs I get info that GraphDB is working locally in some tests. But I need Workbench access to import and create repositories (not sure how to do that, or if it is possible without the Workbench GUI).
Is there a way to connect through the external web to the Workbench using a domain/IP and/or Nginx?
I've read all the documentation and searched all day, but could not find a way to deal with this, sorry. I have only used GraphDB locally (the simple installer version), never the standalone server in production before.
PS: two extra related questions:
a) Is the procedure the same for creating a URI endpoint?
b) What is the recommended way and configuration to have the GraphDB daemon start automatically at boot time (with the command ./graphdb -d in the graph/bin folder)? (I tried the line "/etc/graphdb/bin ./graphdb -d" in rc.local, but it didn't work.)
In case it is useful for someone, I was able to make it work with this Nginx configuration:
server {
    listen 80;
    server_name sparql.xxxxxx.com;

    location / {
        proxy_pass http://localhost:7200;
        proxy_set_header Host $host;
    }
}
I think it was the "proxy_set_header Host $host;" line that solved it (I had tried the rest without it before, and it didn't work). I think GraphDB uses some headers to set configurations, and they were not being passed on.
I wonder if I am forgetting something else important to forward in the proxy, but at the moment the Workbench seems to work and opens on the domain used, "sparql.xxxxxx.com".
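If more headers do turn out to matter later, a slightly fuller variant of the same server block follows common reverse-proxy practice (the extra X-Forwarded-* headers are a general convention, not something GraphDB is documented above to require):

```nginx
server {
    listen 80;
    server_name sparql.xxxxxx.com;

    location / {
        proxy_pass http://localhost:7200;
        proxy_set_header Host $host;
        # Conventional forwarding headers; harmless if unused upstream.
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```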

Rails 5 Action Cable deployment with Nginx, Puma & Redis

I am trying to deploy an Action-Cable-enabled application to a VPS using Capistrano. I am using Puma, Nginx, and Redis (for Cable). After a couple of hurdles, I was able to get it working in a local development environment. I'm using the default in-process /cable URL. But when I try deploying it to the VPS, I keep getting these two errors in the JS log:
Establishing connection to host ws://{server-ip}/cable failed.
Connection to host ws://{server-ip}/cable was interrupted while loading the page.
And in my app-specific nginx.error.log I'm getting these messages:
2016/03/10 16:40:34 [info] 14473#0: *22 client 90.27.197.34 closed keepalive connection
Turning on ActionCable.startDebugging() in the JS prompt shows nothing of interest, just ConnectionMonitor trying to reopen the connection indefinitely. I'm also getting a load of 301 Moved Permanently responses for /cable in my network monitor.
Things I've tried:
Using the async adapter instead of Redis. (This is what is used in the development env)
Adding something like this to my /etc/nginx/sites-enabled/{app-name}:
location /cable/ {
    proxy_pass http://puma;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
}
Setting Rails.application.config.action_cable.allowed_request_origins to the proper host (tried "http://{server-ip}" and "ws://{server-ip}")
Turning on Rails.application.config.action_cable.disable_request_forgery_protection
No luck. What is causing the issue?
$ rails -v
Rails 5.0.0.beta3
Please inform me of any additional details that may be useful.
Finally, I got it working! I've been trying various things for about a week...
The 301-redirects were caused by nginx actually trying to redirect the browser to /cable/ instead of /cable. This is because I had specified /cable/ instead of /cable in the location stanza! I got the idea from this answer.
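For anyone hitting the same 301s, the fix amounts to dropping the trailing slash so the location matches the /cable path the client actually requests (upstream name puma as in the snippet above):

```nginx
# "location /cable" (no trailing slash) matches the path ActionCable uses;
# "location /cable/" made nginx redirect /cable to /cable/ with a 301.
location /cable {
    proxy_pass http://puma;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
}
```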

NGINX configuration for Rails 5 ActionCable with puma

I am using Jelastic for my development environment (not yet in production).
My application is running with Unicorn, but I discovered websockets with ActionCable and integrated them into my application.
Everything is working fine locally, but when deploying to my Jelastic environment (with the default NGINX/Unicorn configuration), I am getting this message in my javascript console, and I see nothing in my access log:
WebSocket connection to 'ws://dev.myapp.com:8080/' failed: WebSocket is closed before the connection is established.
I used to have this locally, and I solved it by adding the needed ActionCable.server.config.allowed_request_origins in my config file. So I double-checked my development config for this, and it is ok.
That's why I was wondering if there is something specific to the NGINX config beyond what is explained on the ActionCable git page:
bundle exec puma -p 28080 cable/config.ru
For my application, I followed everything from the linked guide, but nothing is mentioned about NGINX configuration.
I know that websockets with ActionCable are quite new, but I hope someone will be able to give me a lead on that.
Many thanks
Ok, so I finally managed to fix my issue. Here are the different steps which allowed me to make this work:
1. nginx: I don't really know if this is needed, but as my application is running with Unicorn, I added this to my nginx conf:
upstream websocket {
    server 127.0.0.1:28080;
}

server {
    location /cable/ {
        proxy_pass http://websocket/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
    }
}
And then in my config/environments/development.rb file:
config.action_cable.url = "ws://my.app.com/cable/"
2. Allowed request origin: I then noticed that my connection was refused even though I was using ActionCable.server.config.allowed_request_origins in my config/environments/development.rb file. I wonder if this is due to the development default being http://localhost:3000, as stated in the documentation. So I have added this:
ActionCable.server.config.disable_request_forgery_protection = true
I do not yet have a production environment, so I am not yet able to test how it will behave there.
3. Redis password: as stated in the documentation, I was using a config/redis/cable.yml, but I was getting this error:
Error raised inside the event loop: Replies out of sync: #<RuntimeError: ERR operation not permitted>
/var/www/webroot/ROOT/public/shared/bundle/ruby/2.2.0/gems/em-hiredis-0.3.0/lib/em-hiredis/base_client.rb:130:in `block in connect'
So I understood that the way I was setting the password for my redis server was wrong.
In fact, you have to do something like this:
development:
  <<: *local
  :url: redis://user:password@my.redis.com:6379
  :host: my.redis.com
  :port: 6379
And now everything is working fine, and ActionCable is really impressive.
Maybe some of my issues were trivial, but I am sharing them and how I resolved them so everyone can pick up something if needed.

Deploying Meteor app via Meteor Up or tmux meteor

I'm a bit curious whether Meteor Up (or other Meteor app deployment tools like Modulus) do anything fancy compared to copying over your Meteor application, starting a tmux session, and just running meteor to start your application on your server. Thanks in advance!
Meteor Up and Modulus seem to just run node.js and MongoDB. They run your app after it has been packaged for production with meteor build. This will probably give your app an edge in performance.
It is possible to just run meteor in a tmux or screen session. I use meteor run --settings settings.json --production to pass settings and also use production mode which minifies the code etc. You can also use a proxy forwarder like Nginx to forward requests to port 80 (http) and 443 (https).
For reference here's my Nginx config:
server {
listen 80;
server_name example.com www.example.com;
return 301 https://example.com$request_uri;
}
server {
listen 443 ssl;
server_name www.example.com;
ssl_certificate /etc/ssl/private/example.com.unified.crt;
ssl_certificate_key /etc/ssl/private/example.com.ssl.key;
return 301 https://example.com$request_uri;
}
server {
listen 443 ssl;
server_name example.com;
ssl_certificate /etc/ssl/private/example.com.unified.crt;
ssl_certificate_key /etc/ssl/private/example.com.ssl.key;
location / {
proxy_pass http://localhost:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-NginX-Proxy true;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
By using this method everything is contained within the meteor container, and you get the benefit of meteor watching for changes etc. However, there may be some extra overhead on your server. I'm not sure exactly how much, as I haven't tested both ways enough.
The only problem I've found with this method is that it's not easy to get everything automated on reboot, such as automatically running tmux and then launching meteor, as opposed to using specially designed tools like Node.js Forever or PM2, which start automatically when the server is rebooted. So you have to log in to the server manually and run meteor. If you work out an easy way to do this using tmux or screen, let me know.
Edit:
I have managed to get Meteor to start on system boot with the following line in the /etc/rc.local file:
sudo -H -u ubuntu -i /usr/bin/tmux new-session -d '/home/ubuntu/Sites/meteorapp/run_meteorapp.sh'
This command runs the run_meteorapp.sh shell script inside a tmux session once the system has booted. In the run_meteorapp.sh I have:
#!/usr/bin/env bash
(cd /home/ubuntu/Sites/meteorapp && exec meteor run --settings settings.json --production)
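On distributions with systemd, a unit file is an alternative to the rc.local/tmux approach that also restarts the app if it dies. An untested sketch reusing the user and paths from the snippet above (the unit name and meteor's presence on PATH are assumptions):

```ini
[Unit]
Description=Meteor app
After=network.target

[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu/Sites/meteorapp
ExecStart=/usr/bin/env meteor run --settings settings.json --production
Restart=always

[Install]
WantedBy=multi-user.target
```

Install as /etc/systemd/system/meteorapp.service and enable with sudo systemctl enable --now meteorapp.service.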
If you look at the Meteor Up Github page: https://github.com/arunoda/meteor-up you can see what it does.
Such as:
Features
- Single command server setup
- Single command deployment
- Multi server deployment
- Environmental Variables management
- Support for settings.json
- Password or Private Key (pem) based server authentication
- Access logs from the terminal (supports log tailing)
- Support for multiple meteor deployments (experimental)

Server Configuration
- Auto-Restart if the app crashed (using forever)
- Auto-Start after the server reboot (using upstart)
- Stepdown User Privileges
- Revert to the previous version, if the deployment failed
- Secured MongoDB Installation (Optional)
- Pre-Installed PhantomJS (Optional)
So yes... it does a lot more...
Mupx does even more. It takes advantage of Docker. It is the development version, but I have found it to be more reliable than mup after updating Meteor to 1.2.
More info can be found at the github repo: https://github.com/arunoda/meteor-up/tree/mupx
I have been using mupx to deploy to digital ocean. Once you set up the mup.json file you can not only deploy the app, but you can also update the code on the server easily through the CLI. There are a few other commands too that are helpful.
mupx reconfig - reconfigs app with environment variables
mupx stop - stops app duh
mupx start - ...
mupx restart - ...
mupx logs [-f --tail=100] - this gets logs which can be hugely helpful when you encounter deployment errors.
It certainly makes it easy to update your app, and I have been pretty happy with it.
Mupx uses MeteorD (Docker Runtime for Meteor Apps), and since it uses Docker, it can be really useful to access the MongoDB shell via ssh with this command:
docker exec -it mongodb mongo <appName>
Give it a shot!

An explanation of the nginx/starman/dancer web stack

I've been doing web programming for a while now and am quite familiar with the LAMP stack. I've decided to try playing around with the nginx/starman/dancer stack, and I'm a bit confused about how to understand, from a high level, how all the pieces relate to each other. Setting up the stack doesn't seem as straightforward as setting up the LAMP stack, but that's probably because I don't really understand how the pieces relate.
I understand the role nginx is playing - a lightweight webserver/proxy - but I'm confused about how Starman relates to PSGI, Plack, and Dancer.
I would appreciate a high-level breakdown of how these pieces relate to each other and why each is necessary (or not necessary) to get the stack setup. Thanks!
I've spent the last day reading about the various components and I think I have enough of an understanding to answer my own question. Most of my answer can be found in various places on the web, but hopefully there will be some value to putting all the pieces in one place:
Nginx: The first and most obvious piece of the stack to understand is nginx. Nginx is a lightweight webserver that can act as a replacement for the ubiquitous Apache webserver. Nginx can also act as a proxy server. It has been growing rapidly in its use and currently serves about 10% of all web domains. One crucial advantage of nginx is that it is asynchronous and event-driven instead of creating a process thread to handle each connection. In theory this means that nginx is able to handle a large number of connections without using a lot of system resources.
PSGI: PSGI is a protocol (to distinguish it from a particular implementation of the protocol, such as Plack). The main motivation for creating PSGI, as far as I can gather, is that when Apache was first created there was no native support for handling requests with scripts written in e.g., Perl. The ability to do this was tacked on to Apache using mod_cgi. To test your Perl application, you would have to run the entire webserver, as the application ran within the webserver. In contrast, PSGI provides a protocol with which a webserver can communicate with a server written in e.g. Perl. One of the benefits of this is that it's much easier to test the Perl server independently of the webserver. Another benefit is that once an application server is built, it's very easy to switch in different PSGI-compatible webservers to test which provides the best performance.
Plack: This is a particular implementation of the PSGI protocol that provides the glue between a PSGI-compatible webserver and a Perl application server. Plack is Perl's equivalent of Ruby's Rack.
Starman: A Perl-based webserver that is compatible with the PSGI protocol. One confusion I had was why I would want to use both Starman and Nginx at the same time, but thankfully that question was answered quite well here on Stack Overflow. The essence is that it might be better to let nginx serve static files without requiring a Perl process to do that, while also allowing the Perl application server to run on a higher port.
Dancer: A web application framework for Perl. Kind of an equivalent of Ruby on Rails. Or to be more precise, an equivalent of Sinatra for Ruby (the difference is that Sinatra is a minimalist framework, whereas Ruby on Rails is a more comprehensive web framework). As someone who dealt with PHP and hadn't really used a web framework before, I was a bit confused about how this related to the serving stack. The point of web frameworks is they abstract away common tasks that are very frequently performed in web applications, such as converting database queries into objects/data structures in the web application.
Installation (on ubuntu):
sudo apt-get install nginx
sudo apt-get install build-essential curl
sudo cpan App::cpanminus
sudo cpanm Starman
sudo cpanm Task::Plack
sudo apt-get install libdancer-perl
Getting it running:
cd
dancer -a mywebapp
sudo plackup -s Starman -p 5001 -E deployment --workers=10 -a mywebapp/bin/app.pl
Now you will have a starman server running your Dancer application on port 5001. To make nginx send traffic to the server you have to modify /etc/nginx/nginx.conf and add a rule something like this to the http section:
server {
    server_name permanentinvesting.com;
    listen 80;

    location /css/ {
        alias /home/ubuntu/mywebapp/public/css/;
        expires 30d;
        access_log off;
    }

    location / {
        proxy_pass http://localhost:5001;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
The first location rule specifies that nginx should handle static content in the /css directory by getting it from /home/ubuntu/mywebapp/public/css/. The second location rule says that traffic to the webserver on port 80 should be sent to the Starman server to handle. Now we just need to start nginx:
sudo service nginx start
Your Answer is so far correct, but it would be better to set up nginx the following way:
server {
    listen 80;
    server_name foo.example.com;

    location / {
        # Serve static files directly:
        if (-f $request_filename) {
            expires 30d;
            break;
        }

        # Pass on other requests to Dancer app
        proxy_pass_header Server;
        proxy_pass http://localhost:5001/;
    }
}
This makes nginx serve all static files (JavaScript and images), not just the CSS.
This example is taken from the 2011 Perl Dancer Advent :)
From nginx wiki:
"IfIsEvil ... Directive if has problems when used in location context, in some cases it doesn't do what you expect but something completely different instead. In some cases it even segfaults. It's generally a good idea to avoid it if possible...."
A better set up is:
server {
    listen 80;
    server_name foo.example.com;

    location / {
        # Checks the existence of files and uses the first match
        try_files $uri $uri/ @dancer;
    }

    location @dancer {
        # Pass on other requests to Dancer app
        proxy_pass_header Server;
        proxy_pass http://localhost:5001/;
    }
}
Correction for the answer from s.magri:
location @dancer {
    # Pass on other requests to Dancer app
    proxy_pass_header Server;
    proxy_pass http://localhost:5001;
}
I had to remove the trailing slash in the last proxy_pass directive. My version of nginx (1.10.3) won't start up with the trailing slash.
