How to disable interception of errors by Ingress in a Tectonic kubernetes setup - nginx

I have a couple of NodeJS backends running as pods in a Kubernetes setup, with Ingress-managed nginx over it.
These backends are API servers, and can return 400, 404, or 500 responses during normal operations. These responses would provide meaningful data to the client; besides the status code, the response has a JSON-serialized structure in the body informing about the error cause or suggesting a solution.
However, Ingress will intercept these error responses, and return an error page. Thus the client does not receive the information that the service has tried to provide.
There's a closed ticket in the kubernetes-contrib repository suggesting that it is now possible to turn off error interception: https://github.com/kubernetes/contrib/issues/897. Being new to kubernetes/ingress, I cannot figure out how to apply this configuration in my situation.
For reference, this is the output of kubectl describe ingress <ingress-name> (names and IPs redacted):
Name:             ingress-name-redacted
Namespace:        default
Address:          127.0.0.1
Default backend:  default-http-backend:80 (<none>)
Rules:
  Host                        Path  Backends
  ----                        ----  --------
  public.service.example.com
                              /     service-name:80 (<none>)
Annotations:
  rewrite-target:         /
  service-upstream:       true
  use-port-in-redirects:  true
Events:  <none>

I have solved this on Tectonic 1.7.9-tectonic.4.
In the Tectonic web UI, go to Workloads -> Config Maps and filter by namespace tectonic-system.
In the config maps shown, you should see one named "tectonic-custom-error".
Open it and go to the YAML editor.
In the data field you should have an entry like this:
custom-http-errors: '404, 500, 502, 503'
This entry lists the HTTP status codes that are intercepted and replaced with the custom Tectonic error page.
If you don't want some of them intercepted, remove them from the list, or clear the list entirely.
The change should take effect as soon as you save the updated config map.
Of course, you could do the same from the command line with kubectl edit:
$> kubectl edit cm tectonic-custom-error --namespace=tectonic-system
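For a scripted change instead of an interactive edit, one option is to build a JSON merge patch and pass it to kubectl patch. The following is a minimal Python sketch, assuming the config map name and key from the answer above. The helper name build_patch is hypothetical, and the codes in the example are only an illustration:

```python
import json


def build_patch(keep_codes):
    """Return a JSON merge patch that sets custom-http-errors.

    keep_codes lists the status codes that should still be intercepted;
    every code left out will reach the client unchanged.
    """
    # Same "code, code" format as the value shown in the config map above.
    value = ", ".join(str(code) for code in keep_codes)
    return json.dumps({"data": {"custom-http-errors": value}})


# Example: keep intercepting only 502 and 503, so 404 and 500 pass through.
print(build_patch([502, 503]))
```

The printed string could then be supplied as the -p argument of kubectl patch cm tectonic-custom-error --namespace=tectonic-system --type merge.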
Hope this helps :)

Related

Enforce a domain pattern that a service can use

I have a multi-tenant Kubernetes cluster. On it I have an nginx reverse proxy with load balancer and the domain *.example.com points to its IP.
Now, several namespaces are essentially grouped together as project A and project B (according to the different users).
How can I ensure that any service in a namespace with the label project=a can use any domain like my-service.project-a.example.com, but not something like my-service.project-b.example.com or my-service.example.com? Please keep in mind that I use NetworkPolicies to isolate communication between the different projects, though communication with the nginx namespace and the reverse proxy is always possible.
Any ideas would be very welcome.
EDIT:
I made some progress: I have been deploying Gatekeeper to my GKE clusters via Helm charts. I then tried to ensure that only Ingress hosts of the form "*.project-name.example.com" are allowed. My namespaces each carry a label such as "project=a", and each of them should only be allowed to use Ingress hosts of the form "*.a.example.com". Hence I need the project label of the respective namespace. I wanted to deploy the following resources:
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredingress
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredIngress
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredingress
        operations := {"CREATE", "UPDATE"}
        ns := input.review.object.metadata.namespace
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          input.request.kind.kind == "Ingress"
          not data.kubernetes.namespaces[ns].labels.project
          msg := sprintf("Ingress denied as namespace '%v' is missing 'project' label", [ns])
        }
        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          input.request.kind.kind == "Ingress"
          operations[input.request.operation]
          host := input.request.object.spec.rules[_].host
          project := data.kubernetes.namespaces[ns].labels.project
          not fqdn_matches(host, project)
          msg := sprintf("invalid ingress host %v, has to be of the form *.%v.example.com", [host, project])
        }
        fqdn_matches(str, pattern) {
          str_parts := split(str, ".")
          count(str_parts) == 4
          str_parts[1] == pattern
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredIngress
metadata:
  name: ns-must-have-gk
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Ingress"]
---
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Namespace"
However, when I try to set everything up in the cluster, I keep getting:
kubectl apply -f constraint_template.yaml
Error from server: error when creating "constraint_template.yaml": admission webhook "validation.gatekeeper.sh" denied the request: invalid ConstraintTemplate: invalid data references: check refs failed on module {template}: errors (2):
disallowed ref data.kubernetes.namespaces[ns].labels.project
disallowed ref data.kubernetes.namespaces[ns].labels.project
Do you know how to fix that, and what I did wrong? Also, if you happen to know a better approach, just let me know.
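The host check performed by the fqdn_matches helper in the template can be tried outside the cluster. Below is a rough Python equivalent, meant as a sketch rather than Gatekeeper code. The second function, fqdn_matches_strict, is a hypothetical variant that is not part of the question; it assumes the base domain example.com and also verifies the domain suffix, which the original helper does not do.

```python
def fqdn_matches(host, project):
    """Mirror the Rego helper: four dot-separated labels, second one is the project."""
    parts = host.split(".")
    return len(parts) == 4 and parts[1] == project


def fqdn_matches_strict(host, project, base_domain="example.com"):
    """Hypothetical stricter check: host must be <single-label>.<project>.<base_domain>."""
    suffix = "." + project + "." + base_domain
    if not host.endswith(suffix):
        return False
    name = host[: -len(suffix)]
    # Exactly one non-empty label is allowed in front of the suffix.
    return name != "" and "." not in name


for candidate in ["my-service.a.example.com", "my-service.b.example.com", "x.a.evil.org"]:
    print(candidate, fqdn_matches(candidate, "a"), fqdn_matches_strict(candidate, "a"))
```

The loop shows the difference: a host such as x.a.evil.org has four labels with "a" in second position, so the original check accepts it, while the stricter variant rejects it.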
As an alternative to the other answer, you can use a validating webhook to enforce rules on any parameter present in the request, for example the name, namespace, annotations, or spec.
The validating webhook can be a service running inside the cluster or external to it. This service makes a decision based on the logic you put into it. For every request sent by a user, the API server sends a review request to the webhook, and the webhook either approves or rejects it.
You can read more about it here, more descriptive post by me here.
If you want to enforce this rule on a k8s object such as a ConfigMap or Ingress, I think you can use something like OPA.
In Kubernetes, Admission Controllers enforce semantic validation of objects during create, update, and delete operations. With OPA you can enforce custom policies on Kubernetes objects without recompiling or reconfiguring the Kubernetes API server.
reference

How to set the allowed url length limit for a kubernetes nginx(error code: 414, uri too large)

We have deployed our microservices in a k8s cluster and configured Ingress resources for them, since we access them from outside the cluster. When we make a request larger than 2900 characters, we get back error code 414: uri too large. We searched the internet and found an nginx setting that can help solve this problem:
Syntax: large_client_header_buffers number size ;
Default: large_client_header_buffers 4 8k;
Context: http, server
Since we are using nginx ingress, we checked the ingress-nginx documentation online but could not find the corresponding setting. Can somebody help with this?
Use this page for all nginx-ingress config.
Add the values in the ConfigMap and they will get picked up automatically.
The exact value you are looking for is: large-client-header-buffers
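To check whether a changed buffer size has taken effect, it can help to reproduce the 414 with a URL of known length. The following Python sketch is one way to do that; the function names, the query parameter q, and the host in the usage note are placeholders, not values from the question.

```python
import urllib.error
import urllib.request


def build_long_url(base_url, total_length):
    """Pad base_url with a dummy query parameter until it is total_length characters long."""
    prefix = base_url + "?q="
    padding = total_length - len(prefix)
    if padding < 0:
        raise ValueError("total_length is shorter than the base URL")
    return prefix + "a" * padding


def probe(url):
    """Return the HTTP status code the server sends for url."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still what we want.
        return err.code
```

Calling probe(build_long_url("http://service.example.com/api", 3000)) before and after the ConfigMap change would show whether the status moves away from 414.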

cert manager is failing with Waiting for dns-01 challenge propagation: Could not determine authoritative nameservers

I installed cert-manager on aks-engine using the command below:
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.12.0/cert-manager.yaml
my certificate spec
issuer spec
I'm using nginx as ingress. I can see the TXT record in the Azure DNS zone, created by my Azure DNS service principal, but I'm not sure what the issue with the nameservers is.
I ran into the same error. I suspect it's because I'm using a mix of private and public Azure DNS zones: the record needs to be added to the public zone so letsencrypt can see it, but cert-manager checks that the TXT record is visible before asking letsencrypt to perform the validation. I assume cert-manager looks at the private zone by default, and because there is no TXT record there, it gets stuck on this error.
The way around it, as described on cert-manager.io is to override the default DNS using extraArgs (I'm doing this with terraform and helm):
resource "helm_release" "cert_manager" {
  name       = "cert-manager"
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"

  set {
    name  = "installCRDs"
    value = "true"
  }

  set {
    name  = "extraArgs"
    value = "{--dns01-recursive-nameservers-only,--dns01-recursive-nameservers=8.8.8.8:53\\,1.1.1.1:53}"
  }
}
The issue for me was that I was missing some annotations in the ingress:
cert-manager.io/cluster-issuer: hydrantid
kubernetes.io/tls-acme: 'true'
In my case I am using hydrantid as the issuer, but most people use letsencrypt I guess.
I had a similar error when my certificate was stuck in pending. This is how I resolved it. First list the challenges (urChallengeName stands for the name of your challenge):
kubectl get challenges
urChallengeName
Then run the following:
kubectl patch challenge/urChallengeName -p '{"metadata":{"finalizers":[]}}' --type=merge
When you run kubectl get challenges again, the challenge should be gone.

How to control vhost_shared_traffic memory K8s nginx ingress?

Background
We run a Kubernetes cluster that handles several php/lumen microservices. We started seeing the app's php-fpm/nginx reporting the 499 status code in its logs, and this seems to correspond with the client getting a blank response (curl returns curl: (52) Empty reply from server) while the application logs 499.
10.10.x.x - - [09/Mar/2020:18:26:46 +0000] "POST /some/path/ HTTP/1.1" 499 0 "-" "curl/7.65.3"
My understanding is nginx will return the 499 code when the client socket is no longer open/available to return the content to. In this situation that appears to mean something before the nginx/application layer is terminating this connection. Our configuration currently is:
ELB -> k8s nginx ingress -> application
So my thoughts are either ELB or ingress, since the application is the one that has no socket left to return to. So I started looking through the ingress logs...
Potential core problem?
While looking at the ingress logs, I'm seeing quite a few of these:
2020/03/06 17:40:01 [crit] 11006#11006: ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone "vhost_traffic_status"
Potential Solution
I imagine that if I gave vhost_traffic_status_zone some more memory, at least that error would go away and I could move on to finding the next one. But I can't seem to find any configmap value or annotation that would allow me to control this. I've checked the docs:
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/
Thanks in advance for any insight / suggestions / documentation I might be missing!
Here is the standard way to look up how to modify the nginx.conf in the ingress controller. After that, I'll link some suggestions on how much memory you should give the zone.
First, get the ingress controller version by checking the image version on the deployment:
kubectl -n <namespace> get deployment <deployment-name> -o yaml | grep 'image:'
From there, you can retrieve the code for your version from the following URL. In the following, I will be using version 0.10.2.
https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.10.2
The nginx.conf template can be found at rootfs/etc/nginx/template/nginx.tmpl in the code or at /etc/nginx/template/nginx.tmpl on a pod. It can be grepped for the line of interest. In the example case, we find the following line in nginx.tmpl:
vhost_traffic_status_zone shared:vhost_traffic_status:{{ $cfg.VtsStatusZoneSize }};
This gives us the config variable to look up in the code. Our next grep for VtsStatusZoneSize leads us to the lines in internal/ingress/controller/config/config.go
// Description: Sets parameters for a shared memory zone that will keep states for various keys. The cache is shared between all worker processes
// https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone
// Default value is 10m
VtsStatusZoneSize string `json:"vts-status-zone-size,omitempty"`
This gives us the key "vts-status-zone-size" to be added to the configmap "ingress-nginx-ingress-controller". The current value can be found in the rendered nginx.conf template on a pod at /etc/nginx/nginx.conf.
When it comes to what size you may want to set the zone, there are the docs here that suggest setting it to 2*usedSize:
If the message("ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone") printed in error_log, increase to more than (usedSize * 2).
https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone
"usedSize" can be found by hitting the stats page for nginx or through the JSON endpoint. Here is the request to get the JSON version of the stats and, if you have jq, the path to the value:
curl http://localhost:18080/nginx_status/format/json 2> /dev/null | jq .sharedZones.usedSize
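The "more than (usedSize * 2)" rule can be turned into a concrete value for vts-status-zone-size. This is a small Python sketch under two assumptions: that usedSize is reported in bytes, and that the stats JSON has the sharedZones.usedSize path used in the jq command above. The function name and the sample payload are made up for illustration.

```python
import json

MIB = 1024 * 1024  # nginx interprets the "m" suffix as megabytes of 1024 * 1024 bytes


def suggested_zone_size(stats_json):
    """Return an nginx size string strictly larger than 2 * usedSize, in whole megabytes."""
    used = json.loads(stats_json)["sharedZones"]["usedSize"]
    # Integer-divide, then add one, so the result always exceeds 2 * usedSize.
    megabytes = (2 * used) // MIB + 1
    return f"{megabytes}m"


# Made-up sample: 7 MiB currently in use.
sample = '{"sharedZones": {"usedSize": 7340032}}'
print(suggested_zone_size(sample))  # prints 15m
```

The returned string has the same form as the 10m default, so it could be used directly as the configmap value.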
Hope this helps.

How can I check if my server is alive with metricbeat, Is it possible?

I've been using Elasticsearch, Metricbeat and ElastAlert to watch my server. I have nginx installed on it, used as a reverse proxy, and I need an alert to be sent if nginx goes down or returns an error. I already have some alerts configured, but how can I write a rule that sends an alert when nginx goes down or returns an error?
Thanks a lot
Metricbeat is just for data about system resource usage. What you need is to install Filebeat and activate its nginx module. Then you can use ElastAlert's "any" rule type and filter by fileset.module: nginx and fileset.name: error:
name: your rule name
index: filebeat-*
type: any
filter:
- term:
    fileset.module: "nginx"
- term:
    fileset.name: "error"
alert:
- "slack"
... # your slack config stuff
realert:
  minutes: 1
