Envoy: "upstream connect error or disconnect/reset before headers. reset reason: connection failure" - http

I'm a newbie with Envoy.
I have been struggling with the error below for a week. My downstream (the server that requests some data/updates) receives this response:
Status code: 503
Headers:
...
Server:"envoy"
X-Envoy-Response-Code-Details:"upstream_reset_before_response_started{connection_failure}"
X-Envoy-Response-Flags: "UF,URX"
Body: upstream connect error or disconnect/reset before headers. reset reason: connection failure
On the other side, my upstream sees the connection dropped (context cancelled).
The upstream service doesn't return 503 codes at all.
All traffic goes over HTTP/1.
My envoy.yaml:
admin:
access_log_path: /tmp/admin_access.log
address:
socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
listeners:
- name: listener_0
address:
socket_address: { address: 0.0.0.0, port_value: 80 }
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"#type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: ingress_http
codec_type: http1
route_config:
name: local_route
virtual_hosts:
- name: local_service
domains: [ "*" ]
response_headers_to_add: # added for debugging
- header:
key: x-envoy-response-code-details
value: "%RESPONSE_CODE_DETAILS%"
- header:
key: x-envoy-response-flags
value: "%RESPONSE_FLAGS%"
routes:
- match: # consistent routing
safe_regex:
google_re2: { }
regex: SOME_STRANGE_REGEX_FOR_CONSISTENT_ROUTING
route:
cluster: consistent_cluster
hash_policy:
header:
header_name: ":path"
regex_rewrite:
pattern:
google_re2: { }
regex: SOME_STRANGE_REGEX_FOR_CONSISTENT_ROUTING
substitution: "\\1"
retry_policy: # attempt to avoid 503 errors by retries
retry_on: "connect-failure,refused-stream,unavailable,cancelled,resource-exhausted,retriable-status-codes"
retriable_status_codes: [ 503 ]
num_retries: 3
retriable_request_headers:
- name: ":method"
exact_match: "GET"
- match: { prefix: "/" } # default routing (all routes except consistent)
route:
cluster: default_cluster
retry_policy: # attempt to avoid 503 errors by retries
retry_on: "connect-failure,refused-stream,unavailable,cancelled,resource-exhausted,retriable-status-codes"
retriable_status_codes: [ 503 ]
retry_host_predicate:
- name: envoy.retry_host_predicates.previous_hosts
host_selection_retry_max_attempts: 3
http_filters:
- name: envoy.filters.http.router
clusters:
- name: consistent_cluster
connect_timeout: 0.05s
type: STRICT_DNS
dns_refresh_rate: 1s
dns_lookup_family: V4_ONLY
lb_policy: MAGLEV
health_checks:
- timeout: 1s
interval: 1s
unhealthy_threshold: 1
healthy_threshold: 1
http_health_check:
path: "/health"
load_assignment:
cluster_name: consistent_cluster
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: consistent-host
port_value: 80
- name: default_cluster
connect_timeout: 0.05s
type: STRICT_DNS
dns_refresh_rate: 1s
dns_lookup_family: V4_ONLY
lb_policy: ROUND_ROBIN
health_checks:
- timeout: 1s
interval: 1s
unhealthy_threshold: 1
healthy_threshold: 1
http_health_check:
path: "/health"
outlier_detection: # attempt to avoid 503 errors by ejecting unhealth pods
consecutive_gateway_failure: 1
load_assignment:
cluster_name: default_cluster
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: default-host
port_value: 80
I also tried adding access logs:
access_log:
- name: accesslog
typed_config:
"#type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
path: /tmp/http_access.log
log_format:
text_format: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %CONNECTION_TERMINATION_DETAILS% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"
filter:
status_code_filter:
comparison:
op: GE
value:
default_value: 500
runtime_key: access_log.access_error.status
It gave me nothing, because %CONNECTION_TERMINATION_DETAILS% is always empty ("-"), and I had already seen the response flags in the headers of the downstream responses.
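A note on the format string: %CONNECTION_TERMINATION_DETAILS% is generally only populated for L4 connection terminations, which is why it stays "-"; the operator that usually carries upstream connect/TLS failure details is %UPSTREAM_TRANSPORT_FAILURE_REASON%. A sketch of an extended format (only the format string changes; the rest of the access log config stays as above):
log_format:
  text_format: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_CODE_DETAILS% %RESPONSE_FLAGS% %UPSTREAM_TRANSPORT_FAILURE_REASON% \"%UPSTREAM_HOST%\" %DURATION%\n"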
I increased connect_timeout twice (0.01s -> 0.02s -> 0.05s). It didn't help at all, and other services (with direct routing) work fine with a 10ms connect timeout.
By the way, after a redeploy everything works fine for approximately 20 minutes.
I hope to hear your ideas about what this could be and where I should dig in.
P.S.: I also see health check errors in the logs sometimes, but I have no idea why. Everything worked well without Envoy (no errors, no timeouts): health checking, direct requests, etc.
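One observation on the config above, offered as a hedged sketch rather than a diagnosis: a 50ms connect_timeout combined with health checks using interval: 1s, timeout: 1s and unhealthy_threshold: 1 is very aggressive, so a single slow handshake or probe can eject a host and produce exactly this kind of intermittent UF/URX plus occasional health check errors. Less aggressive values to experiment with (illustrative numbers only):
clusters:
  - name: default_cluster
    connect_timeout: 1s          # illustrative: leaves headroom for a slow TCP handshake
    # ... rest of the cluster unchanged ...
    health_checks:
      - timeout: 2s
        interval: 5s
        unhealthy_threshold: 3   # don't eject a host after a single failed probe
        healthy_threshold: 1
        http_health_check:
          path: "/health"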

I experienced a similar problem when starting Envoy as a Docker container. In the end, the reason was a missing --network host option in the docker run command, which led to the clusters not being visible from within Envoy's Docker container. Maybe this helps you, too?
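If Envoy does run in Docker and host networking turns out to be the missing piece, the docker-compose equivalent of --network host is network_mode: host. A minimal sketch (the image tag and file paths are assumptions, not from the thread):
services:
  envoy:
    image: envoyproxy/envoy:v1.18-latest   # placeholder tag; use the version you actually run
    network_mode: host                     # same effect as `docker run --network host`
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml # mount the config shown above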

Related

artifactory docker push error unknown: Method Not Allowed

We are using an Artifactory Pro license and installed Artifactory through Helm on Kubernetes.
When we create a Docker local repo (the Repository Path Method) and push a Docker image,
we get a 405 Method Not Allowed error. (docker login / pull works normally.)
########## error msg
# docker push art2.bee0dev.lge.com/docker-local/hello-world
e07ee1baac5f: Pushing [==================================================>] 14.85kB
unknown: Method Not Allowed
##########
We are using an HAProxy load balancer for TLS, in front of the NGINX Ingress Controller.
(The NGINX Ingress Controller's HTTP NodePort is 31071.)
Please help us figure out how to solve this problem.
The Artifactory and HAProxy settings are as follows.
########## value.yaml
global:
joinKeySecretName: "artbee-stg-joinkey-secret"
masterKeySecretName: "artbee-stg-masterkey-secret"
storageClass: "sa-stg-netapp8300-bee-blk-nonretain"
ingress:
enabled: true
defaultBackend:
enabled: false
hosts: ["art2.bee0dev.lge.com"]
routerPath: /
artifactoryPath: /artifactory/
className: ""
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/proxy-body-size: "0"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
nginx.ingress.kubernetes.io/configuration-snippet: |
proxy_pass_header Server;
proxy_set_header X-JFrog-Override-Base-Url https://art2.bee0dev.lge.com;
labels: {}
tls: []
additionalRules: []
## Artifactory license.
artifactory:
name: artifactory
replicaCount: 1
image:
registry: releases-docker.jfrog.io
repository: jfrog/artifactory-pro
# tag:
pullPolicy: IfNotPresent
labels: {}
updateStrategy:
type: RollingUpdate
migration:
enabled: false
customInitContainersBegin: |
- name: "init-mount-permission-setup"
image: "{{ .Values.initContainerImage }}"
imagePullPolicy: "{{ .Values.artifactory.image.pullPolicy }}"
securityContext:
runAsUser: 0
runAsGroup: 0
allowPrivilegeEscalation: false
capabilities:
drop:
- NET_RAW
command:
- 'bash'
- '-c'
- if [ $(ls -la /var/opt/jfrog | grep artifactory | awk -F' ' '{print $3$4}') == 'rootroot' ]; then
echo "mount permission=> root:root";
echo "change mount permission to 1030:1030 " {{ .Values.artifactory.persistence.mountPath }};
chown -R 1030:1030 {{ .Values.artifactory.persistence.mountPath }};
else
echo "already set. No change required.";
ls -la {{ .Values.artifactory.persistence.mountPath }};
fi
volumeMounts:
- mountPath: "{{ .Values.artifactory.persistence.mountPath }}"
name: artifactory-volume
database:
maxOpenConnections: 80
tomcat:
maintenanceConnector:
port: 8091
connector:
maxThreads: 200
sendReasonPhrase: false
extraConfig: 'acceptCount="100"'
customPersistentVolumeClaim: {}
license:
## licenseKey is the license key in plain text. Use either this or the license.secret setting
licenseKey: "???"
secret:
dataKey:
resources:
requests:
memory: "2Gi"
cpu: "1"
limits:
memory: "20Gi"
cpu: "8"
javaOpts:
xms: "1g"
xmx: "12g"
admin:
ip: "127.0.0.1"
username: "admin"
password: "!swiit123"
secret:
dataKey:
service:
name: artifactory
type: ClusterIP
loadBalancerSourceRanges: []
annotations: {}
persistence:
mountPath: "/var/opt/jfrog/artifactory"
enabled: true
accessMode: ReadWriteOnce
size: 100Gi
type: file-system
storageClassName: "sa-stg-netapp8300-bee-blk-nonretain"
nginx:
enabled: false
##########
########## haproxy config
frontend cto-stage-http-frontend
bind 10.185.60.75:80
bind 10.185.60.76:80
bind 10.185.60.201:80
bind 10.185.60.75:443 ssl crt /etc/haproxy/ssl/bee0dev.lge.com.pem ssl-min-ver TLSv1.2 ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
bind 10.185.60.76:443 ssl crt /etc/haproxy/ssl/bee0dev.lge.com.pem ssl-min-ver TLSv1.2 ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
bind 10.185.60.201:443 ssl crt /etc/haproxy/ssl/bee0dev.lge.com.pem ssl-min-ver TLSv1.2 ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
mode http
option forwardfor
option accept-invalid-http-request
acl k8s-cto-stage hdr_end(host) -i -f /etc/haproxy/web-ide/cto-stage
use_backend k8s-cto-stage-http if k8s-cto-stage
backend k8s-cto-stage-http
mode http
redirect scheme https if !{ ssl_fc }
option tcp-check
balance roundrobin
server lgestgbee04v 10.185.60.78:31071 check fall 3 rise 2
##########
The request doesn't seem to be landing at the correct endpoint. Please remove the semicolon from the docker command and retry.
docker push art2.bee0dev.lge.com;/docker-local/hello-world
Try executing it like below,
docker push art2.bee0dev.lge.com/docker-local/hello-world

Struggling to make gRPC-web and HTTPS work

I've got a SPA web app that uses gRPC web and envoy to proxy back to a server that speaks gRPC. This all works great, no problems.
I'm trying to make this secure using HTTPS/TLS and just keep running into issues and can't make it work.
Our setup is this:
Web client SPA app (served from a Node.js web server also running on the lahinch server; the URL is https://lahinch.mycorp.com). The web app connects to the Envoy proxy at "https://coxos.mycorp.com:8090".
\
Envoy Proxy (coxos - 172.16.0.116) - listens on port 8090 and proxies to port 50251
\
\
Backend gRPC server (lahinch - 172.16.0.109) - listens on port 50251
From reading the envoy docs, the web client is downstream and the backend server is upstream.
Here is my envoy.yaml file
admin:
access_log_path: /tmp/admin_access.log
address:
socket_address:
address: 0.0.0.0
port_value: 9901
static_resources:
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 8090
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: ingress_http
access_log:
- name: envoy.access_loggers.file
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
path: /dev/stdout
- name: envoy.access_loggers.http_grpc
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.access_loggers.grpc.v3.HttpGrpcAccessLogConfig
common_config:
log_name: envoygrpclog
grpc_service:
envoy_grpc:
cluster_name: controlweb_backendservice
transport_api_version: V3
route_config:
name: local_route
virtual_hosts:
- name: local_service
domains:
- '*'
routes:
- match:
prefix: /
route:
cluster: controlweb_backendservice
hash_policy:
- header:
header_name: x-session-hash
max_stream_duration:
grpc_timeout_header_max: 300s
cors:
allow_origin_string_match:
- safe_regex:
google_re2: {}
regex: .*
allow_methods: 'GET, PUT, DELETE, POST, OPTIONS'
allow_headers: >-
keep-alive,user-agent,cache-control,content-type,content-transfer-encoding,grpc-status-details-bin,x-accept-content-transfer-encoding,x-accept-response-streaming,x-user-agent,x-grpc-web,grpc-timeout,access-token,x-session-hash
expose_headers: >-
grpc-status-details-bin,grpc-status,grpc-message,access-token
max_age: '1728000'
http_filters:
- name: envoy.filters.http.grpc_web
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
- name: envoy.filters.http.cors
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
- name: envoy.filters.http.router
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
transport_socket:
name: envoy.transport_sockets.tls
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.17.0/api-v3/extensions/transport_sockets/tls/v3/tls.proto#extensions-transport-sockets-tls-v3-downstreamtlscontext
"#type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
common_tls_context:
tls_certificates:
- certificate_chain:
# Certificate must be PEM-encoded
filename: /etc/lahinch.pem
private_key:
filename: /etc/lahinch.key.pem
validation_context:
trusted_ca:
filename: /etc/ssl/certs/ZZZ-CA256.pem
clusters:
- name: controlweb_backendservice
type: LOGICAL_DNS
connect_timeout: 0.25s
dns_lookup_family: V4_ONLY
lb_policy: ROUND_ROBIN
load_assignment:
cluster_name: cluster_controlweb_backendservice
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: lahinch.mycorp.com
port_value: 50251
http2_protocol_options: {}
transport_socket:
name: envoy.transport_sockets.tls
typed_config:
'@type': >-
type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
common_tls_context:
tls_certificates:
- certificate_chain:
filename: /etc/lahinch.pem
private_key:
filename: /etc/lahinch.key.pem
validation_context:
trusted_ca:
filename: /etc/ssl/certs/ZZZ-CA256.pem
Using this, I'm getting the following in the envoy log when I try and run my web app:
[2021-04-09 22:08:33.939][17][debug][conn_handler] [source/server/connection_handler_impl.cc:501] [C2] new connection
[2021-04-09 22:08:33.945][17][debug][http] [source/common/http/conn_manager_impl.cc:254] [C2] new stream
[2021-04-09 22:08:33.945][17][debug][http] [source/common/http/conn_manager_impl.cc:886] [C2][S3055347406573314092] request headers complete (end_stream=false):
':authority', 'coxos.mycorp.com:8090'
':path', '/WanderAuth.HostService/LogIn'
':method', 'POST'
'connection', 'keep-alive'
'content-length', '124'
'accept', 'application/grpc-web-text'
'x-user-agent', 'grpc-web-javascript/0.1'
'access-token', ''
'x-grpc-web', '1'
'user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36 Edg/89.0.774.68'
'grpc-timeout', '90000m'
'content-type', 'application/grpc-web-text'
'origin', 'https://lahinch.mycorp.com'
'sec-fetch-site', 'same-site'
'sec-fetch-mode', 'cors'
'sec-fetch-dest', 'empty'
'referer', 'https://lahinch.mycorp.com/'
'accept-encoding', 'gzip, deflate, br'
'accept-language', 'en-US,en;q=0.9'
[2021-04-09 22:08:33.946][17][debug][router] [source/common/router/router.cc:425] [C2][S3055347406573314092] cluster 'controlweb_backendservice' match for URL '/WanderAuth.HostService/LogIn'
[2021-04-09 22:08:33.946][17][debug][router] [source/common/router/router.cc:582] [C2][S3055347406573314092] router decoding headers:
':authority', 'coxos.mycorp.com:8090'
':path', '/WanderAuth.HostService/LogIn'
':method', 'POST'
':scheme', 'https'
'accept', 'application/grpc-web-text'
'x-user-agent', 'grpc-web-javascript/0.1'
'access-token', ''
'x-grpc-web', '1'
'user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36 Edg/89.0.774.68'
'grpc-timeout', '90000m'
'content-type', 'application/grpc'
'origin', 'https://lahinch.mycorp.com'
'sec-fetch-site', 'same-site'
'sec-fetch-mode', 'cors'
'sec-fetch-dest', 'empty'
'referer', 'https://lahinch.mycorp.com/'
'accept-encoding', 'gzip, deflate, br'
'accept-language', 'en-US,en;q=0.9'
'x-forwarded-proto', 'https'
'x-request-id', 'a4a041ab-dc29-4ed7-a342-90ac03b3be3c'
'te', 'trailers'
'grpc-accept-encoding', 'identity'
'x-envoy-expected-rq-timeout-ms', '15000'
[2021-04-09 22:08:33.946][17][debug][pool] [source/common/http/conn_pool_base.cc:79] queueing stream due to no available connections
[2021-04-09 22:08:33.946][17][debug][pool] [source/common/conn_pool/conn_pool_base.cc:106] creating a new connection
[2021-04-09 22:08:33.946][17][debug][client] [source/common/http/codec_client.cc:41] [C3] connecting
[2021-04-09 22:08:33.946][17][debug][connection] [source/common/network/connection_impl.cc:860] [C3] connecting to 172.16.0.109:50251
[2021-04-09 22:08:33.946][17][debug][connection] [source/common/network/connection_impl.cc:876] [C3] connection in progress
[2021-04-09 22:08:33.946][17][debug][http2] [source/common/http/http2/codec_impl.cc:1184] [C3] updating connection-level initial window size to 268435456
[2021-04-09 22:08:33.946][17][debug][http] [source/common/http/filter_manager.cc:755] [C2][S3055347406573314092] request end stream
[2021-04-09 22:08:33.947][17][debug][connection] [source/common/network/connection_impl.cc:666] [C3] connected
[2021-04-09 22:08:33.947][17][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:224] [C3] TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2021-04-09 22:08:33.948][17][debug][connection] [source/common/network/connection_impl.cc:241] [C3] closing socket: 0
[2021-04-09 22:08:33.948][17][debug][connection] [source/extensions/transport_sockets/tls/ssl_socket.cc:224] [C3] TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2021-04-09 22:08:33.948][17][debug][client] [source/common/http/codec_client.cc:99] [C3] disconnect. resetting 0 pending requests
[2021-04-09 22:08:33.948][17][debug][pool] [source/common/conn_pool/conn_pool_base.cc:343] [C3] client disconnected, failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2021-04-09 22:08:33.948][17][debug][router] [source/common/router/router.cc:1026] [C2][S3055347406573314092] upstream reset: reset reason: connection failure, transport failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[2021-04-09 22:08:33.948][17][debug][http] [source/common/http/filter_manager.cc:839] [C2][S3055347406573314092] Sending local reply with details upstream_reset_before_response_started{connection failure,TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER}
[2021-04-09 22:08:33.948][17][debug][http] [source/common/http/conn_manager_impl.cc:1484] [C2][S3055347406573314092] encoding headers via codec (end_stream=false):
':status', '503'
'content-length', '190'
'content-type', 'application/grpc-web-text+proto'
'access-control-allow-origin', 'https://lahinch.mycorp.com'
'access-control-expose-headers', 'grpc-status-details-bin,grpc-status,grpc-message,access-token'
'date', 'Fri, 09 Apr 2021 22:08:33 GMT'
'server', 'envoy'
[2021-04-09 22:08:36.139][9][debug][upstream] [source/common/upstream/logical_dns_cluster.cc:101] starting async DNS resolution for lahinch.mycorp.com
[2021-04-09 22:08:36.139][9][debug][main] [source/server/server.cc:199] flushing stats
[2021-04-09 22:08:36.141][9][debug][upstream] [source/common/upstream/logical_dns_cluster.cc:109] async DNS resolution complete for lahinch.mycorp.com
[2021-04-09 22:08:36.141][9][debug][upstream] [source/common/upstream/logical_dns_cluster.cc:155] DNS refresh rate reset for lahinch.mycorp.com, refresh rate 5000 ms
So the error appears to be this: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
I've looked up this error and it appears to be related to security and certificates, but I haven't been able to find a good answer as to what I'm doing wrong.
When it comes to the required certs, should they be the same ones used by the client (downstream), the proxy, or the backend (upstream server), or both? I've tried using different certs for the different servers, and the same certs for all of them, and I still get the same error.
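Not an authoritative answer, but WRONG_VERSION_NUMBER during the upstream handshake usually means Envoy tried to speak TLS to a port that answered in plaintext, so the upstream transport_socket is the first thing to question rather than which certificate goes where. A sketch of the two halves under that assumption: the listener keeps a server certificate issued for the proxy host (coxos), and the cluster drops UpstreamTlsContext entirely if the gRPC server on 50251 does not actually serve TLS (the certificate filenames below are placeholders):
# Listener side: Envoy presents its own server certificate to the browser.
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
    common_tls_context:
      tls_certificates:
        - certificate_chain: { filename: /etc/coxos.pem }       # cert for coxos.mycorp.com (placeholder)
          private_key: { filename: /etc/coxos.key.pem }
# Cluster side: plaintext HTTP/2 to the gRPC server, i.e. no transport_socket at all.
clusters:
  - name: controlweb_backendservice
    type: LOGICAL_DNS
    connect_timeout: 0.25s
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    load_assignment:
      cluster_name: controlweb_backendservice
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address: { address: lahinch.mycorp.com, port_value: 50251 }
If the backend does serve TLS, the usual shape is to keep UpstreamTlsContext but with only a validation_context/trusted_ca, adding tls_certificates on the cluster only if the server requires mutual TLS.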

Simple gRPC envoy configuration

I'm trying to set up an Envoy proxy as a gRPC front end and can't get it to work, so I'm trying to get to as simple a test setup as possible and build from there, but I can't get that to work either. Here's what my test setup looks like:
Python server (slightly modified gRPC example code)
# greeter_server.py
from concurrent import futures
import time
import grpc
import helloworld_pb2
import helloworld_pb2_grpc
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
class Greeter(helloworld_pb2_grpc.GreeterServicer):
def SayHello(self, request, context):
return helloworld_pb2.HelloReply(message='Hello, %s!' % request.name)
def serve():
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
server.add_insecure_port('[::]:8081')
server.start()
try:
while True:
time.sleep(_ONE_DAY_IN_SECONDS)
except KeyboardInterrupt:
server.stop(0)
if __name__ == '__main__':
serve()
Python client (slightly modified gRPC example code)
from __future__ import print_function
import grpc
import helloworld_pb2
import helloworld_pb2_grpc
def run():
# NOTE(gRPC Python Team): .close() is possible on a channel and should be
# used in circumstances in which the with statement does not fit the needs
# of the code.
with grpc.insecure_channel('localhost:9911') as channel:
stub = helloworld_pb2_grpc.GreeterStub(channel)
response = stub.SayHello(helloworld_pb2.HelloRequest(name='you'))
print("Greeter client received: " + response.message)
if __name__ == '__main__':
run()
And then my two envoy yaml files:
# envoy-hello-server.yaml
static_resources:
listeners:
- address:
socket_address:
address: 0.0.0.0
port_value: 8811
filter_chains:
- filters:
- name: envoy.http_connection_manager
typed_config:
"#type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
codec_type: auto
stat_prefix: ingress_http
access_log:
- name: envoy.file_access_log
typed_config:
"#type": type.googleapis.com/envoy.config.accesslog.v2.FileAccessLog
path: "/dev/stdout"
route_config:
name: local_route
virtual_hosts:
- name: backend
domains:
- "*"
routes:
- match:
prefix: "/"
grpc: {}
route:
cluster: hello_grpc_service
http_filters:
- name: envoy.router
typed_config: {}
clusters:
- name: hello_grpc_service
connect_timeout: 0.250s
type: strict_dns
lb_policy: round_robin
http2_protocol_options: {}
load_assignment:
cluster_name: hello_grpc_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: hello_grpc_service
port_value: 8081
admin:
access_log_path: "/tmp/envoy_hello_server.log"
address:
socket_address:
address: 0.0.0.0
port_value: 8881
and
# envoy-hello-client.yaml
static_resources:
listeners:
- address:
socket_address:
address: 0.0.0.0
port_value: 9911
filter_chains:
- filters:
- name: envoy.http_connection_manager
typed_config:
"#type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
codec_type: auto
add_user_agent: true
access_log:
- name: envoy.file_access_log
typed_config:
"#type": type.googleapis.com/envoy.config.accesslog.v2.FileAccessLog
path: "/dev/stdout"
stat_prefix: egress_http
common_http_protocol_options:
idle_timeout: 0.840s
use_remote_address: true
route_config:
name: local_route
virtual_hosts:
- name: backend
domains:
- grpc
routes:
- match:
prefix: "/"
route:
cluster: backend-proxy
http_filters:
- name: envoy.router
typed_config: {}
clusters:
- name: backend-proxy
type: logical_dns
dns_lookup_family: V4_ONLY
lb_policy: round_robin
connect_timeout: 0.250s
http_protocol_options: {}
load_assignment:
cluster_name: backend-proxy
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: hello_grpc_service
port_value: 8811
admin:
access_log_path: "/tmp/envoy_hello_client.log"
address:
socket_address:
address: 0.0.0.0
port_value: 9991
Now, what I expect this would allow is something like hello_client.py (port 9911) -> envoy (envoy-hello-client.yaml) -> envoy (envoy-hello-server.yaml) -> hello_server.py (port 8081)
Instead, what I get is an error from the python client:
$ python3 greeter_client.py
Traceback (most recent call last):
File "greeter_client.py", line 35, in <module>
run()
File "greeter_client.py", line 30, in run
response = stub.SayHello(helloworld_pb2.HelloRequest(name='you'))
File "/usr/lib/python3/dist-packages/grpc/_channel.py", line 533, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/usr/lib/python3/dist-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNIMPLEMENTED
details = ""
debug_error_string = "{"created":"@1594770575.642032812","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"","grpc_status":12}"
>
And in the envoy client log:
[2020-07-14 16:22:10.407][16935][info][main] [external/envoy/source/server/server.cc:652] starting main dispatch loop
[2020-07-14 16:23:25.441][16935][info][runtime] [external/envoy/source/common/runtime/runtime_impl.cc:524] RTDS has finished initialization
[2020-07-14 16:23:25.441][16935][info][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:182] cm init: all clusters initialized
[2020-07-14 16:23:25.441][16935][info][main] [external/envoy/source/server/server.cc:631] all clusters initialized. initializing init manager
[2020-07-14 16:23:25.441][16935][info][config] [external/envoy/source/server/listener_manager_impl.cc:844] all dependencies initialized. starting workers
[2020-07-14 16:23:25.441][16935][warning][main] [external/envoy/source/server/server.cc:537] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
[2020-07-14T23:49:35.641Z] "POST /helloworld.Greeter/SayHello HTTP/2" 200 NR 0 0 0 - "10.0.0.56" "grpc-python/1.16.1 grpc-c/6.0.0 (linux; chttp2; gao)" "aa72310a-3188-46b2-8cbf-9448b074f7ae" "localhost:9911" "-"
And nothing in the server log.
Also, weirdly, there is an almost one-second delay between when I run the Python client and when the log message shows up in the client Envoy.
What am I missing to make these two scripts talk via envoy?
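One thing that may be worth checking, offered as a guess from the log rather than as part of the answer below: the access log line above ends with the NR flag, which means no route matched, and the client-side virtual host only matches domains: [grpc] while the Python client sends localhost:9911 as the :authority. A sketch of a wildcard virtual host for envoy-hello-client.yaml, everything else unchanged:
route_config:
  name: local_route
  virtual_hosts:
    - name: backend
      domains: ["*"]        # match any :authority, including localhost:9911
      routes:
        - match:
            prefix: "/"
          route:
            cluster: backend-proxy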
I know I'm a bit late; hope this helps someone. Since your gRPC server is running on the same host, you could specify the hostname as host.docker.internal (the previous docker.for.mac.localhost is deprecated as of Docker v18.03.0).
In your case, if you are running in a dockerized environment, you could do the following:
Envoy version: 1.13+
clusters:
- name: backend-proxy
type: logical_dns
dns_lookup_family: V4_ONLY
lb_policy: round_robin
connect_timeout: 0.250s
http_protocol_options: {}
load_assignment:
cluster_name: backend-proxy
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: host.docker.internal
port_value: 8811
hello_grpc_service won't resolve to an IP in a dockerized environment.
Note: you can enable Envoy's trace log level for more detailed logs.
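One caveat on host.docker.internal: it resolves out of the box on Docker Desktop (Mac/Windows), but on Linux you typically have to add the mapping yourself. A docker-compose sketch, assuming Docker Engine 20.10+ for the host-gateway alias (service name and image tag are placeholders):
services:
  envoy-client:
    image: envoyproxy/envoy:v1.13.1               # placeholder tag matching "Envoy version: 1.13+"
    volumes:
      - ./envoy-hello-client.yaml:/etc/envoy/envoy.yaml
    ports:
      - "9911:9911"
    extra_hosts:
      - "host.docker.internal:host-gateway"       # lets the container reach services on the host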

Setting up Ingress (Kubernetes)

I want to set up an Ingress that routes traffic to my underlying Services. Unfortunately, I get an error when I deploy my ingress-controller-deployment.yaml and I don't know why: the Pod with the ingress controller crashes immediately with "CrashLoopBackOff".
As I understand it, the ingress controller has to be deployed in a Pod, and this Pod can be accessed through the ingress-svc. The ingress-svc seems to work, but the Pod crashes. Once the ingress controller works, I will need an additional file that defines the routes and so on, but I don't see the point of continuing without a working and deployable ingress controller.
Pod description:
Name: ingress-controller-7749c785f-x94ll
Namespace: ingress
Node: gke-cluster-1-default-pool-8484e77d-r4wp/10.128.0.2
Start Time: Thu, 26 Apr 2018 14:25:04 +0200
Labels: k8s-app=nginx-ingress-lb
pod-template-hash=330573419
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"ingress","name":"ingress-controller-7749c785f","uid":"d8ff0a6d-494c-11e8-a840
-420...
Status: Running
IP: 10.8.0.14
Created By: ReplicaSet/ingress-controller-7749c785f
Controlled By: ReplicaSet/ingress-controller-7749c785f
Containers:
nginx-ingress-controller:
Container ID: docker://5654c7dffc44510132cba303d66ee570280f2cec235e4d4fa6ef8ad543e0c91d
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0
Image ID: docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller@sha256:39cc6ce23e5bcdf8aa78bc28bbcfe0999e449bf99fe2e8d60984b417facc5cd4
Ports: 80/TCP, 443/TCP
Args:
/nginx-ingress-controller
--admin-backend-svc=$(POD_NAMESPACE)/admin-backend
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Thu, 26 Apr 2018 14:26:57 +0200
Finished: Thu, 26 Apr 2018 14:26:57 +0200
Ready: False
Restart Count: 4
Liveness: http-get http://:10254/healthz delay=10s timeout=5s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-controller-7749c785f-x94ll (v1:metadata.name)
POD_NAMESPACE: ingress (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-plbss (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-plbss:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-plbss
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Ingress-controller-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: ingress-controller
spec:
replicas: 1
revisionHistoryLimit: 3
template:
metadata:
labels:
k8s-app: nginx-ingress-lb
spec:
containers:
- args:
- /nginx-ingress-controller
- "--admin-backend-svc=$(POD_NAMESPACE)/admin-backend"
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
image: "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.9.0"
imagePullPolicy: Always
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 5
name: nginx-ingress-controller
ports:
- containerPort: 80
name: http
protocol: TCP
- containerPort: 443
name: https
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: ingress-svc
spec:
type: LoadBalancer
ports:
- name: http
port: 80
targetPort: http
- name: https
port: 443
targetPort: https
selector:
k8s-app: nginx-ingress-lb
The issue is the args. The args on one of mine are
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
- --configmap=$(POD_NAMESPACE)/nginx-configuration
- --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
- --udp-services-configmap=$(POD_NAMESPACE)/udp-services
- --publish-service=$(POD_NAMESPACE)/ingress-nginx
- --annotations-prefix=nginx.ingress.kubernetes.io
I had also created the ConfigMaps for configuration, TCP and UDP services (sketched below).
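For reference, those ConfigMaps can start out empty; the names only need to match the flags above. A minimal sketch, assuming the controller runs in an ingress-nginx namespace (adjust names and namespace to your deployment):
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx   # placeholder namespace
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: udp-services
  namespace: ingress-nginx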

k8s nginx container return json response

I have a k8s cluster running, among other things, an nginx.
When I do curl -v <url> I get:
HTTP/1.1 200 OK
< Content-Type: text/html
< Date: Fri, 24 Mar 2017 15:25:27 GMT
< Server: nginx
< Strict-Transport-Security: max-age=15724800; includeSubDomains; preload
< Content-Length: 0
< Connection: keep-alive
<
* Curl_http_done: called premature == 0
* Connection #0 to host <url> left intact
however when I do curl -v <url> -H 'Accept: application/json' I get
< HTTP/1.1 200 OK
< Content-Type: text/html
< Date: Fri, 24 Mar 2017 15:26:10 GMT
< Server: nginx
< Strict-Transport-Security: max-age=15724800; includeSubDomains; preload
< Content-Length: 0
< Connection: keep-alive
<
* Curl_http_done: called premature == 0
* Connection #0 to host <url> left intact
* Could not resolve host: application
* Closing connection 1
curl: (6) Could not resolve host: application
My task is to get the request to return JSON, not HTML.
To my understanding, I have to create an ingress controller and modify the nginx.conf somehow. I've been trying for a few days now but can't get it right. Any kind of help would be much appreciated.
The following are the YAML files I've been using:
configmap:
apiVersion: v1
data:
server-tokens: "false"
proxy-body-size: "4110m"
server-name-hash-bucket-size: "128"
kind: ConfigMap
metadata:
name: nginx-load-balancer-conf
labels:
app: nginx-ingress-lb
daemonset:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: nginx-ingress-lb
labels:
app: nginx-ingress-lb
spec:
template:
metadata:
labels:
name: nginx-ingress-lb
app: nginx-ingress-lb
spec:
terminationGracePeriodSeconds: 60
nodeSelector:
NodeType: worker
containers:
- image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.1
name: nginx-ingress-lb
imagePullPolicy: Always
readinessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
livenessProbe:
httpGet:
path: /healthz
port: 10254
scheme: HTTP
initialDelaySeconds: 10
timeoutSeconds: 1
# use downward API
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
ports:
- containerPort: 80
hostPort: 80
- containerPort: 443
hostPort: 443
args:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
- --configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: default-http-backend
labels:
app: default-http-backend
spec:
replicas: 2
template:
metadata:
labels:
app: default-http-backend
spec:
terminationGracePeriodSeconds: 60
containers:
- name: default-http-backend
# Any image is permissable as long as:
# 1. It serves a 404 page at /
# 2. It serves 200 on a /healthz endpoint
image: gcr.io/google_containers/defaultbackend:1.2
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 5
ports:
- containerPort: 8080
resources:
limits:
cpu: 100m
memory: 20Mi
requests:
cpu: 100m
memory: 20Mi
service:
apiVersion: v1
kind: Service
metadata:
name: default-http-backend
labels:
app: default-http-backend
spec:
selector:
app: default-http-backend
ports:
- port: 80
targetPort: 8080
Remove the space after the colon in curl -v <url> -H 'Accept: application/json'.
The error message Could not resolve host: application means curl is treating application/json as a URL instead of as part of the header.
There are two things:
Exposing your app
Making your app return json
The Ingress is only relevant for exposing your app, and it's not the only option: you can use a Service (type LoadBalancer, for example) to achieve that too on most cloud providers. So I'd keep it simple and not use an Ingress for now, until you solve the second problem.
As has been explained, your curl has a syntax problem, and that's why it shows curl: (6) Could not resolve host: application.
The other thing is that fixing that won't make your app return JSON, because with that header you are only saying you accept JSON. If you want your app to return JSON, you need to implement that in your app; nginx can't guess how you want to map your HTML to JSON. There is pretty much no other way than writing it, at least that I know of :-/
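For the first part (exposing the app without an Ingress), a minimal Service of type LoadBalancer could look like the sketch below; the name, label and ports are placeholders for your own deployment:
apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder name
spec:
  type: LoadBalancer
  selector:
    app: my-app           # must match your Pod labels
  ports:
    - port: 80            # port exposed by the cloud load balancer
      targetPort: 8080    # port your container listens on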
