We have been trying to secure the Istio ingress gateway with SIMPLE TLS for our gRPC backend, which for now is deployed in Minikube (minikube version v1.25.2), by following this link.
We were able to successfully access the gRPC service (a .NET 6 gRPC server) over plaintext through the Istio ingress gateway using the grpcurl client.
But when we switch to SIMPLE TLS, we get the following error:
ERROR:
Code: Unavailable
Message: upstream connect error or disconnect/reset before headers. reset reason: remote reset
The steps we followed:
Created a certificate and a private key for sc-imcps-bootstrap-lb.example.com (a sample domain for the gRPC server in Minikube):
$ openssl req -out sc-imcps-bootstrap-lb.example.com.csr -newkey rsa:2048 -nodes -keyout sc-imcps-bootstrap-lb.example.com.key -config sc-imcps-bootstrap-lb.cnf
sc-imcps-bootstrap-lb.cnf:
[req]
distinguished_name = req_distinguished_name
prompt = no
[req_distinguished_name]
O = sc-imcps organization
OU = R&D
CN = sc-imcps-bootstrap-lb.example.com
$ openssl x509 -req -sha256 -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in sc-imcps-bootstrap-lb.example.com.csr -out sc-imcps-bootstrap-lb.example.com.crt -extfile v3.ext
v3.ext:
subjectAltName = @alt_names
[alt_names]
IP.1 = 10.97.36.53
DNS.1 = sc-imcps-bootstrap-lb.example.com
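To confirm that the SAN entries actually made it into the signed certificate, a quick sanity check can be run against the issued file (a hedged check; the file name matches the output of the signing command above):
$ openssl x509 -in sc-imcps-bootstrap-lb.example.com.crt -noout -text | grep -A1 "Subject Alternative Name"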
Created the Kubernetes secret with the following command:
$ kubectl create -n istio-system secret tls sc-imcps-bootstrap-lb-credential --key=sc-imcps-bootstrap-lb.example.com.key --cert=sc-imcps-bootstrap-lb.example.com.crt
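As a sanity check (hedged, since the exact gateway pod name varies per install), we confirmed the secret exists in istio-system and that the ingress gateway has picked it up:
$ kubectl -n istio-system get secret sc-imcps-bootstrap-lb-credential
$ istioctl proxy-config secret <ingressgateway-pod>.istio-system   # <ingressgateway-pod> is a placeholder for the actual gateway pod name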
Created the Gateway manifest and applied it with kubectl apply -n foo -f gateway.yaml (gateway.yaml is attached below).
Configured the gateway's traffic routes by creating a VirtualService definition (virtualservice.yaml is attached below).
Added a host entry to the C:\Windows\System32\drivers\etc\hosts file:
10.97.36.53 sc-imcps-bootstrap-lb.example.com
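The IP in the hosts entry is the address we use to reach the Istio ingress gateway; in Minikube it can be looked up with (hedged, assuming the default istio-ingressgateway service name):
$ kubectl -n istio-system get svc istio-ingressgateway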
Client execution from the host:
$ grpcurl -v -H Host:sc-imcps-bootstrap-lb.example.com -d '{"AppName": "SC", "AppVersion": 1, "PID": 8132, "ContainerID": "asd-2", "CloudInternal": true}' -cacert example.com.crt -proto imcps.proto sc-imcps-bootstrap-lb.example.com:443 imcps.IMCPS/Init
RESULT:
Resolved method descriptor:
// Sends a greeting
rpc Init ( .imcps.ClientInfo ) returns ( .imcps.InitOutput );
Request metadata to send:
(empty)
Response headers received:
(empty)
Response trailers received:
content-type: application/grpc
date: Tue, 18 Oct 2022 10:32:07 GMT
server: istio-envoy
x-envoy-upstream-service-time: 46
Sent 1 request and received 0 responses
ERROR:
Code: Unavailable
Message: upstream connect error or disconnect/reset before headers. reset reason: remote reset
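Note that the trailers above already carry server: istio-envoy, so the TLS handshake with the gateway itself succeeds and the reset happens further upstream. For completeness, the gateway certificate can also be checked independently of gRPC (a hedged check using the same CA file as the grpcurl call):
$ openssl s_client -connect sc-imcps-bootstrap-lb.example.com:443 -servername sc-imcps-bootstrap-lb.example.com -CAfile example.com.crt </dev/null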
NOTE:
$ istioctl version
client version: 1.15.0
control plane version: 1.15.0
data plane version: 1.15.0 (5 proxies)
Gateway:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: sc-imcps-gateway
spec:
  selector:
    istio: ingressgateway # use istio default ingress gateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: sc-imcps-bootstrap-lb-credential
    hosts:
    - sc-imcps-bootstrap-lb.example.com
VirtualService:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sc-imcps-bootstrap-route
spec:
  hosts:
  - sc-imcps-bootstrap-lb.example.com
  gateways:
  - sc-imcps-gateway
  http:
  - match:
    - uri:
        prefix: /imcps.IMCPS/Init
    route:
    - destination:
        host: sc-imcps-bootstrap-svc
        port:
          number: 17080
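We also ran the manifests through Istio's built-in configuration analyzer to rule out obvious misconfiguration (hedged; istioctl analyze is available in the istioctl version shown above):
$ istioctl analyze -n foo
$ istioctl analyze -n istio-system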
Logs from the istio-proxy container of the gRPC backend server pod:
2022-10-18T10:04:29.412448Z debug envoy http [C190] new stream
2022-10-18T10:04:29.412530Z debug envoy http [C190][S8764333332205046325] request headers complete (end_stream=false):
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-envoy-internal', 'true'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-decorator-operation', 'sc-imcps-bootstrap-svc.foo.svc.cluster.local:17080/imcps.IMCPS/Init*'
'x-envoy-peer-metadata', 'ChQKDkFQUF9DT05UQUlORVJTEgIaAAoaCgpDTFVTVEVSX0lEEgwaCkt1YmVybmV0ZXMKHAoMSU5TVEFOQ0VfSVBTEgwaCjEwLjg4LjAuNTMKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKvwMKBkxBQkVMUxK0AyqxAwodCgNhcHASFhoUaXN0aW8taW5ncmVzc2dhdGV3YXkKEwoFY2hhcnQSChoIZ2F0ZXdheXMKFAoIaGVyaXRhZ2USCBoGVGlsbGVyCjYKKWluc3RhbGwub3BlcmF0b3IuaXN0aW8uaW8vb3duaW5nLXJlc291cmNlEgkaB3Vua25vd24KGQoFaXN0aW8SEBoOaW5ncmVzc2dhdGV3YXkKGQoMaXN0aW8uaW8vcmV2EgkaB2RlZmF1bHQKMAobb3BlcmF0b3IuaXN0aW8uaW8vY29tcG9uZW50EhEaD0luZ3Jlc3NHYXRld2F5cwohChFwb2QtdGVtcGxhdGUtaGFzaBIMGgo1ODVkNjQ1ODU1ChIKB3JlbGVhc2USBxoFaXN0aW8KOQofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIWGhRpc3Rpby1pbmdyZXNzZ2F0ZXdheQovCiNzZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1yZXZpc2lvbhIIGgZsYXRlc3QKIgoXc2lkZWNhci5pc3Rpby5pby9pbmplY3QSBxoFZmFsc2UKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsCi8KBE5BTUUSJxolaXN0aW8taW5ncmVzc2dhdGV3YXktNTg1ZDY0NTg1NS1icmt4NAobCglOQU1FU1BBQ0USDhoMaXN0aW8tc3lzdGVtCl0KBU9XTkVSElQaUmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9pc3Rpby1zeXN0ZW0vZGVwbG95bWVudHMvaXN0aW8taW5ncmVzc2dhdGV3YXkKFwoRUExBVEZPUk1fTUVUQURBVEESAioACicKDVdPUktMT0FEX05BTUUSFhoUaXN0aW8taW5ncmVzc2dhdGV3YXk='
'x-envoy-peer-metadata-id', 'router~10.88.0.53~istio-ingressgateway-585d645855-brkx4.istio-system~istio-system.svc.cluster.local'
'x-envoy-attempt-count', '1'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
2022-10-18T10:04:29.412567Z debug envoy connection [C190] current connecting state: false
2022-10-18T10:04:29.412674Z debug envoy router [C190][S8764333332205046325] cluster 'inbound|17080||' match for URL '/imcps.IMCPS/Init'
2022-10-18T10:04:29.412692Z debug envoy upstream transport socket match, socket default selected for host with address 10.244.120.108:17080
2022-10-18T10:04:29.412696Z debug envoy upstream Created host 10.244.120.108:17080.
2022-10-18T10:04:29.412729Z debug envoy upstream addHost() adding 10.244.120.108:17080
2022-10-18T10:04:29.412784Z debug envoy upstream membership update for TLS cluster inbound|17080|| added 1 removed 0
2022-10-18T10:04:29.412789Z debug envoy upstream re-creating local LB for TLS cluster inbound|17080||
2022-10-18T10:04:29.412742Z debug envoy router [C190][S8764333332205046325] router decoding headers:
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-attempt-count', '1'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
'x-envoy-internal', 'true'
'x-forwarded-client-cert', 'By=spiffe://cluster.local/ns/foo/sa/default;Hash=dda6034f03e05bbb9d0183b80583ee9b5842670599dd86827c8f8b6a74060fa0;Subject="";URI=spiffe://cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account'
2022-10-18T10:04:29.412802Z debug envoy upstream membership update for TLS cluster inbound|17080|| added 1 removed 0
2022-10-18T10:04:29.412804Z debug envoy upstream re-creating local LB for TLS cluster inbound|17080||
2022-10-18T10:04:29.412809Z debug envoy pool queueing stream due to no available connections (ready=0 busy=0 connecting=0)
2022-10-18T10:04:29.412813Z debug envoy pool trying to create new connection
2022-10-18T10:04:29.412816Z debug envoy pool creating a new connection (connecting=0)
2022-10-18T10:04:29.412869Z debug envoy http2 [C320] updating connection-level initial window size to 268435456
2022-10-18T10:04:29.412873Z debug envoy connection [C320] current connecting state: true
2022-10-18T10:04:29.412875Z debug envoy client [C320] connecting
2022-10-18T10:04:29.412877Z debug envoy connection [C320] connecting to 10.244.120.108:17080
2022-10-18T10:04:29.412928Z debug envoy connection [C320] connection in progress
2022-10-18T10:04:29.412939Z debug envoy http [C190][S8764333332205046325] request end stream
2022-10-18T10:04:29.412960Z debug envoy upstream membership update for TLS cluster inbound|17080|| added 1 removed 0
2022-10-18T10:04:29.412965Z debug envoy upstream re-creating local LB for TLS cluster inbound|17080||
2022-10-18T10:04:29.412972Z debug envoy connection [C320] connected
2022-10-18T10:04:29.412975Z debug envoy client [C320] connected
2022-10-18T10:04:29.412979Z debug envoy pool [C320] attaching to next stream
2022-10-18T10:04:29.412981Z debug envoy pool [C320] creating stream
2022-10-18T10:04:29.412988Z debug envoy router [C190][S8764333332205046325] pool ready
2022-10-18T10:04:29.517255Z debug envoy http2 [C320] stream 1 closed: 1
2022-10-18T10:04:29.517291Z debug envoy client [C320] request reset
2022-10-18T10:04:29.517301Z debug envoy pool [C320] destroying stream: 0 remaining
2022-10-18T10:04:29.517318Z debug envoy router [C190][S8764333332205046325] upstream reset: reset reason: remote reset, transport failure reason:
2022-10-18T10:04:29.517366Z debug envoy http [C190][S8764333332205046325] Sending local reply with details upstream_reset_before_response_started{remote_reset}
2022-10-18T10:04:29.517607Z debug envoy http [C190][S8764333332205046325] encoding headers via codec (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: remote reset'
'x-envoy-peer-metadata', 'ChwKDkFQUF9DT05UQUlORVJTEgoaCHNjLWltY3BzChoKCkNMVVNURVJfSUQSDBoKS3ViZXJuZXRlcwogCgxJTlNUQU5DRV9JUFMSEBoOMTAuMjQ0LjEyMC4xMDgKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKjgIKBkxBQkVMUxKDAiqAAgoRCgNhcHASChoIc2MtaW1jcHMKMQoYY29udHJvbGxlci1yZXZpc2lvbi1oYXNoEhUaE3NjLWltY3BzLTU5Njg0YzY3ODgKJAoZc2VjdXJpdHkuaXN0aW8uaW8vdGxzTW9kZRIHGgVpc3RpbwotCh9zZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1uYW1lEgoaCHNjLWltY3BzCi8KI3NlcnZpY2UuaXN0aW8uaW8vY2Fub25pY2FsLXJldmlzaW9uEggaBmxhdGVzdAoyCiJzdGF0ZWZ1bHNldC5rdWJlcm5ldGVzLmlvL3BvZC1uYW1lEgwaCnNjLWltY3BzLTAKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsChQKBE5BTUUSDBoKc2MtaW1jcHMtMAoSCglOQU1FU1BBQ0USBRoDZm9vCkkKBU9XTkVSEkAaPmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9mb28vc3RhdGVmdWxzZXRzL3NjLWltY3BzChcKEVBMQVRGT1JNX01FVEFEQVRBEgIqAAobCg1XT1JLTE9BRF9OQU1FEgoaCHNjLWltY3Bz'
'x-envoy-peer-metadata-id', 'sidecar~10.244.120.108~sc-imcps-0.foo~foo.svc.cluster.local'
'date', 'Tue, 18 Oct 2022 10:04:29 GMT'
'server', 'istio-envoy'
2022-10-18T10:04:29.517689Z debug envoy http2 [C190] stream 3 closed: 0
2022-10-18T10:04:29.517832Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_response_messages_total , stat=12, recurrent=1
2022-10-18T10:04:29.517843Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_request_messages_total , stat=16, recurrent=1
2022-10-18T10:04:29.520398Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_requests_total , stat=24, recurrent=0
2022-10-18T10:04:29.522737Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_response_bytes , stat=18, recurrent=0
2022-10-18T10:04:29.526875Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_request_duration_milliseconds , stat=22, recurrent=0
2022-10-18T10:04:29.530799Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:664]::report() metricKey cache miss istio_request_bytes , stat=26, recurrent=0
2022-10-18T10:04:29.553171Z debug envoy http [C190] new stream
2022-10-18T10:04:29.553272Z debug envoy http [C190][S417038132095363947] request headers complete (end_stream=false):
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-envoy-internal', 'true'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-decorator-operation', 'sc-imcps-bootstrap-svc.foo.svc.cluster.local:17080/imcps.IMCPS/Init*'
'x-envoy-peer-metadata', 'ChQKDkFQUF9DT05UQUlORVJTEgIaAAoaCgpDTFVTVEVSX0lEEgwaCkt1YmVybmV0ZXMKHAoMSU5TVEFOQ0VfSVBTEgwaCjEwLjg4LjAuNTMKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKvwMKBkxBQkVMUxK0AyqxAwodCgNhcHASFhoUaXN0aW8taW5ncmVzc2dhdGV3YXkKEwoFY2hhcnQSChoIZ2F0ZXdheXMKFAoIaGVyaXRhZ2USCBoGVGlsbGVyCjYKKWluc3RhbGwub3BlcmF0b3IuaXN0aW8uaW8vb3duaW5nLXJlc291cmNlEgkaB3Vua25vd24KGQoFaXN0aW8SEBoOaW5ncmVzc2dhdGV3YXkKGQoMaXN0aW8uaW8vcmV2EgkaB2RlZmF1bHQKMAobb3BlcmF0b3IuaXN0aW8uaW8vY29tcG9uZW50EhEaD0luZ3Jlc3NHYXRld2F5cwohChFwb2QtdGVtcGxhdGUtaGFzaBIMGgo1ODVkNjQ1ODU1ChIKB3JlbGVhc2USBxoFaXN0aW8KOQofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIWGhRpc3Rpby1pbmdyZXNzZ2F0ZXdheQovCiNzZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1yZXZpc2lvbhIIGgZsYXRlc3QKIgoXc2lkZWNhci5pc3Rpby5pby9pbmplY3QSBxoFZmFsc2UKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsCi8KBE5BTUUSJxolaXN0aW8taW5ncmVzc2dhdGV3YXktNTg1ZDY0NTg1NS1icmt4NAobCglOQU1FU1BBQ0USDhoMaXN0aW8tc3lzdGVtCl0KBU9XTkVSElQaUmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9pc3Rpby1zeXN0ZW0vZGVwbG95bWVudHMvaXN0aW8taW5ncmVzc2dhdGV3YXkKFwoRUExBVEZPUk1fTUVUQURBVEESAioACicKDVdPUktMT0FEX05BTUUSFhoUaXN0aW8taW5ncmVzc2dhdGV3YXk='
'x-envoy-peer-metadata-id', 'router~10.88.0.53~istio-ingressgateway-585d645855-brkx4.istio-system~istio-system.svc.cluster.local'
'x-envoy-attempt-count', '2'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
2022-10-18T10:04:29.553290Z debug envoy connection [C190] current connecting state: false
2022-10-18T10:04:29.553412Z debug envoy router [C190][S417038132095363947] cluster 'inbound|17080||' match for URL '/imcps.IMCPS/Init'
2022-10-18T10:04:29.553445Z debug envoy upstream Using existing host 10.244.120.108:17080.
2022-10-18T10:04:29.553462Z debug envoy router [C190][S417038132095363947] router decoding headers:
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-attempt-count', '2'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
'x-envoy-internal', 'true'
'x-forwarded-client-cert', 'By=spiffe://cluster.local/ns/foo/sa/default;Hash=dda6034f03e05bbb9d0183b80583ee9b5842670599dd86827c8f8b6a74060fa0;Subject="";URI=spiffe://cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account'
2022-10-18T10:04:29.553473Z debug envoy pool [C320] using existing fully connected connection
2022-10-18T10:04:29.553477Z debug envoy pool [C320] creating stream
2022-10-18T10:04:29.553487Z debug envoy router [C190][S417038132095363947] pool ready
2022-10-18T10:04:29.553519Z debug envoy http [C190][S417038132095363947] request end stream
2022-10-18T10:04:29.554585Z debug envoy http2 [C320] stream 3 closed: 1
2022-10-18T10:04:29.554607Z debug envoy client [C320] request reset
2022-10-18T10:04:29.554616Z debug envoy pool [C320] destroying stream: 0 remaining
2022-10-18T10:04:29.554631Z debug envoy router [C190][S417038132095363947] upstream reset: reset reason: remote reset, transport failure reason:
2022-10-18T10:04:29.554671Z debug envoy http [C190][S417038132095363947] Sending local reply with details upstream_reset_before_response_started{remote_reset}
2022-10-18T10:04:29.554756Z debug envoy http [C190][S417038132095363947] encoding headers via codec (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: remote reset'
'x-envoy-peer-metadata', 'ChwKDkFQUF9DT05UQUlORVJTEgoaCHNjLWltY3BzChoKCkNMVVNURVJfSUQSDBoKS3ViZXJuZXRlcwogCgxJTlNUQU5DRV9JUFMSEBoOMTAuMjQ0LjEyMC4xMDgKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKjgIKBkxBQkVMUxKDAiqAAgoRCgNhcHASChoIc2MtaW1jcHMKMQoYY29udHJvbGxlci1yZXZpc2lvbi1oYXNoEhUaE3NjLWltY3BzLTU5Njg0YzY3ODgKJAoZc2VjdXJpdHkuaXN0aW8uaW8vdGxzTW9kZRIHGgVpc3RpbwotCh9zZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1uYW1lEgoaCHNjLWltY3BzCi8KI3NlcnZpY2UuaXN0aW8uaW8vY2Fub25pY2FsLXJldmlzaW9uEggaBmxhdGVzdAoyCiJzdGF0ZWZ1bHNldC5rdWJlcm5ldGVzLmlvL3BvZC1uYW1lEgwaCnNjLWltY3BzLTAKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsChQKBE5BTUUSDBoKc2MtaW1jcHMtMAoSCglOQU1FU1BBQ0USBRoDZm9vCkkKBU9XTkVSEkAaPmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9mb28vc3RhdGVmdWxzZXRzL3NjLWltY3BzChcKEVBMQVRGT1JNX01FVEFEQVRBEgIqAAobCg1XT1JLTE9BRF9OQU1FEgoaCHNjLWltY3Bz'
'x-envoy-peer-metadata-id', 'sidecar~10.244.120.108~sc-imcps-0.foo~foo.svc.cluster.local'
'date', 'Tue, 18 Oct 2022 10:04:29 GMT'
'server', 'istio-envoy'
2022-10-18T10:04:29.554788Z debug envoy http2 [C190] stream 5 closed: 0
2022-10-18T10:04:29.554893Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=12
2022-10-18T10:04:29.554903Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=16
2022-10-18T10:04:29.554905Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=24
2022-10-18T10:04:29.554914Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=18
2022-10-18T10:04:29.554917Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=22
2022-10-18T10:04:29.554919Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=26
2022-10-18T10:04:29.561521Z debug envoy http [C190] new stream
2022-10-18T10:04:29.561614Z debug envoy http [C190][S7465002415732961759] request headers complete (end_stream=false):
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-envoy-internal', 'true'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-decorator-operation', 'sc-imcps-bootstrap-svc.foo.svc.cluster.local:17080/imcps.IMCPS/Init*'
'x-envoy-peer-metadata', 'ChQKDkFQUF9DT05UQUlORVJTEgIaAAoaCgpDTFVTVEVSX0lEEgwaCkt1YmVybmV0ZXMKHAoMSU5TVEFOQ0VfSVBTEgwaCjEwLjg4LjAuNTMKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKvwMKBkxBQkVMUxK0AyqxAwodCgNhcHASFhoUaXN0aW8taW5ncmVzc2dhdGV3YXkKEwoFY2hhcnQSChoIZ2F0ZXdheXMKFAoIaGVyaXRhZ2USCBoGVGlsbGVyCjYKKWluc3RhbGwub3BlcmF0b3IuaXN0aW8uaW8vb3duaW5nLXJlc291cmNlEgkaB3Vua25vd24KGQoFaXN0aW8SEBoOaW5ncmVzc2dhdGV3YXkKGQoMaXN0aW8uaW8vcmV2EgkaB2RlZmF1bHQKMAobb3BlcmF0b3IuaXN0aW8uaW8vY29tcG9uZW50EhEaD0luZ3Jlc3NHYXRld2F5cwohChFwb2QtdGVtcGxhdGUtaGFzaBIMGgo1ODVkNjQ1ODU1ChIKB3JlbGVhc2USBxoFaXN0aW8KOQofc2VydmljZS5pc3Rpby5pby9jYW5vbmljYWwtbmFtZRIWGhRpc3Rpby1pbmdyZXNzZ2F0ZXdheQovCiNzZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1yZXZpc2lvbhIIGgZsYXRlc3QKIgoXc2lkZWNhci5pc3Rpby5pby9pbmplY3QSBxoFZmFsc2UKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsCi8KBE5BTUUSJxolaXN0aW8taW5ncmVzc2dhdGV3YXktNTg1ZDY0NTg1NS1icmt4NAobCglOQU1FU1BBQ0USDhoMaXN0aW8tc3lzdGVtCl0KBU9XTkVSElQaUmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9pc3Rpby1zeXN0ZW0vZGVwbG95bWVudHMvaXN0aW8taW5ncmVzc2dhdGV3YXkKFwoRUExBVEZPUk1fTUVUQURBVEESAioACicKDVdPUktMT0FEX05BTUUSFhoUaXN0aW8taW5ncmVzc2dhdGV3YXk='
'x-envoy-peer-metadata-id', 'router~10.88.0.53~istio-ingressgateway-585d645855-brkx4.istio-system~istio-system.svc.cluster.local'
'x-envoy-attempt-count', '3'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
2022-10-18T10:04:29.561647Z debug envoy connection [C190] current connecting state: false
2022-10-18T10:04:29.561750Z debug envoy router [C190][S7465002415732961759] cluster 'inbound|17080||' match for URL '/imcps.IMCPS/Init'
2022-10-18T10:04:29.561796Z debug envoy upstream Using existing host 10.244.120.108:17080.
2022-10-18T10:04:29.561825Z debug envoy router [C190][S7465002415732961759] router decoding headers:
':method', 'POST'
':scheme', 'https'
':path', '/imcps.IMCPS/Init'
':authority', 'sc-imcps-bootstrap-lb.example.com:443'
'content-type', 'application/grpc'
'user-agent', 'grpcurl/v1.8.6 grpc-go/1.44.1-dev'
'te', 'trailers'
'x-forwarded-for', '10.88.0.1'
'x-forwarded-proto', 'https'
'x-request-id', '0d9b8e43-da2e-4f99-bbd8-a5c0c56f799f'
'x-envoy-attempt-count', '3'
'x-b3-traceid', '17b50b6247fe2fcbbc2b2057ef4db96d'
'x-b3-spanid', 'bc2b2057ef4db96d'
'x-b3-sampled', '0'
'x-envoy-internal', 'true'
'x-forwarded-client-cert', 'By=spiffe://cluster.local/ns/foo/sa/default;Hash=dda6034f03e05bbb9d0183b80583ee9b5842670599dd86827c8f8b6a74060fa0;Subject="";URI=spiffe://cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account'
2022-10-18T10:04:29.561841Z debug envoy pool [C320] using existing fully connected connection
2022-10-18T10:04:29.561844Z debug envoy pool [C320] creating stream
2022-10-18T10:04:29.561850Z debug envoy router [C190][S7465002415732961759] pool ready
2022-10-18T10:04:29.561877Z debug envoy http [C190][S7465002415732961759] request end stream
2022-10-18T10:04:29.616003Z debug envoy http2 [C320] stream 5 closed: 1
2022-10-18T10:04:29.616037Z debug envoy client [C320] request reset
2022-10-18T10:04:29.616045Z debug envoy pool [C320] destroying stream: 0 remaining
2022-10-18T10:04:29.616057Z debug envoy router [C190][S7465002415732961759] upstream reset: reset reason: remote reset, transport failure reason:
2022-10-18T10:04:29.616083Z debug envoy http [C190][S7465002415732961759] Sending local reply with details upstream_reset_before_response_started{remote_reset}
2022-10-18T10:04:29.616133Z debug envoy http [C190][S7465002415732961759] encoding headers via codec (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: remote reset'
'x-envoy-peer-metadata', 'ChwKDkFQUF9DT05UQUlORVJTEgoaCHNjLWltY3BzChoKCkNMVVNURVJfSUQSDBoKS3ViZXJuZXRlcwogCgxJTlNUQU5DRV9JUFMSEBoOMTAuMjQ0LjEyMC4xMDgKGQoNSVNUSU9fVkVSU0lPThIIGgYxLjE1LjAKjgIKBkxBQkVMUxKDAiqAAgoRCgNhcHASChoIc2MtaW1jcHMKMQoYY29udHJvbGxlci1yZXZpc2lvbi1oYXNoEhUaE3NjLWltY3BzLTU5Njg0YzY3ODgKJAoZc2VjdXJpdHkuaXN0aW8uaW8vdGxzTW9kZRIHGgVpc3RpbwotCh9zZXJ2aWNlLmlzdGlvLmlvL2Nhbm9uaWNhbC1uYW1lEgoaCHNjLWltY3BzCi8KI3NlcnZpY2UuaXN0aW8uaW8vY2Fub25pY2FsLXJldmlzaW9uEggaBmxhdGVzdAoyCiJzdGF0ZWZ1bHNldC5rdWJlcm5ldGVzLmlvL3BvZC1uYW1lEgwaCnNjLWltY3BzLTAKGgoHTUVTSF9JRBIPGg1jbHVzdGVyLmxvY2FsChQKBE5BTUUSDBoKc2MtaW1jcHMtMAoSCglOQU1FU1BBQ0USBRoDZm9vCkkKBU9XTkVSEkAaPmt1YmVybmV0ZXM6Ly9hcGlzL2FwcHMvdjEvbmFtZXNwYWNlcy9mb28vc3RhdGVmdWxzZXRzL3NjLWltY3BzChcKEVBMQVRGT1JNX01FVEFEQVRBEgIqAAobCg1XT1JLTE9BRF9OQU1FEgoaCHNjLWltY3Bz'
'x-envoy-peer-metadata-id', 'sidecar~10.244.120.108~sc-imcps-0.foo~foo.svc.cluster.local'
'date', 'Tue, 18 Oct 2022 10:04:29 GMT'
'server', 'istio-envoy'
2022-10-18T10:04:29.616158Z debug envoy http2 [C190] stream 7 closed: 0
2022-10-18T10:04:29.616256Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=12
2022-10-18T10:04:29.616265Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=16
2022-10-18T10:04:29.616267Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=24
2022-10-18T10:04:29.616270Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=18
2022-10-18T10:04:29.616272Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=22
2022-10-18T10:04:29.616274Z debug envoy wasm wasm log stats_inbound stats_inbound: [extensions/stats/plugin.cc:645]::report() metricKey cache hit , stat=26
2022-10-18T10:04:29.664070Z debug envoy conn_handler [C321] new connection from 192.168.1.13:40686
PS: We have successfully implemented SIMPLE and MUTUAL TLS for REST services.
Any help will be very much appreciated; I am stuck here. Eventually, after this, we will also need to set up mTLS.
Thanks in advance.
We have been using a gRPC server with .NET 6. The Kestrel-hosted .NET 6 gRPC server runs in Kubernetes over a plain HTTP transport: the Minikube load balancer terminates TLS and forwards the request to the app with the :scheme pseudo-header set to "https", while the actual transport is "http", and this mismatch results in this error. Here is the issue. Also see the discussions in thread-1 and thread-2.
In my case, the solution was to add the following Kestrel configuration:
webBuilder.UseKestrel(opts =>
{
    // accept requests whose :scheme ("https") differs from the actual transport ("http"),
    // which happens when TLS is terminated upstream by the Istio ingress gateway
    opts.AllowAlternateSchemes = true;
});
I am trying to connect to a Spark cluster using sparklyr in yarn-client mode.
In local mode (master = "local") my Spark setup works, but when I try to connect to the cluster, I get the following error:
Error in force(code) :
Failed during initialize_connection: java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
(see full error log below)
The setup is as follows. The Spark cluster (hosted on AWS) was set up with Ambari and runs YARN 3.1.1, Spark 2.3.2, HDFS 3.1.1, and some other services; it works with other (non-R/Python) applications set up through Ambari. Note that setting up the R machine through Ambari is not possible, as it runs Ubuntu while the Spark cluster runs CentOS 7.
On my R machine I use the following code. Note that I have installed Java 8 (openjdk) and the matching Spark version.
Inside my YARN_CONF_DIR I have placed the yarn-site.xml file exported from Ambari (Services -> Download All Client Configs). I have also tried copying hdfs-site.xml and hive-site.xml there, with the same result.
library(sparklyr)
library(DBI)
# spark_install("2.3.2")
spark_installed_versions()
#> spark hadoop dir
#> 1 2.3.2 2.7 /home/david/spark/spark-2.3.2-bin-hadoop2.7
# use Java 8 instead of Java 11 (Java 11 is not supported by Spark 2.3.2, only by 3.0.0+)
Sys.setenv(JAVA_HOME = "/usr/lib/jvm/java-8-openjdk-amd64/")
Sys.setenv(SPARK_HOME = "/home/david/spark/spark-2.3.2-bin-hadoop2.7/")
Sys.setenv(YARN_CONF_DIR = "/home/david/Spark-test/yarn-conf")
conf <- spark_config()
conf$spark.executor.memory <- "500M"
conf$spark.executor.cores <- 2
conf$spark.executor.instances <- 1
conf$spark.dynamicAllocation.enabled <- "false"
sc <- spark_connect(master = "yarn-client", config = conf)
#> Error in force(code) :
#> Failed during initialize_connection: java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
#> ...
I am not really sure how to debug this, on which machine the error originates, or how to fix it, so any help or hint is greatly appreciated!
Edit / Progress
So far I have found out that the Spark version installed by sparklyr (from here) depends on the Glassfish Jersey artifacts, whereas my cluster expects classes under the old Sun/Oracle namespace (hence the com/sun/... path).
This affects the following Java packages:
library(tidyverse)
library(glue)
ll <- list.files("~/spark/spark-2.3.2-bin-hadoop2.7/jars/", pattern = "^jersey", full.names = TRUE)
df <- map_dfr(ll, function(f) {
  x <- system(glue("jar tvf {f}"), intern = TRUE)
  tibble(file = f, class = str_extract(x, "[^ ]+$"))
})
df %>%
  filter(str_detect(class, "com/sun")) %>%
  count(file)
#> # A tibble: 4 x 2
#> file n
#> <chr> <int>
#> 1 /home/david/spark/spark-2.3.2-bin-hadoop2.7/jars//activation-1.1.1.jar 15
#> 2 /home/david/spark/spark-2.3.2-bin-hadoop2.7/jars//derby.log 1194
#> 3 /home/david/spark/spark-2.3.2-bin-hadoop2.7/jars//jersey-client-1.19.jar 108
#> 4 /home/david/spark/spark-2.3.2-bin-hadoop2.7/jars//jersey-server-2.22.2.jar 22
I have tried to load the latest jar files from Maven (e.g., from this) for jersey-client.jar and jersey-core.jar, and now the connection takes ages and never finishes (at least it is no longer the same error, yay I guess...). Any idea what the cause of this issue is?
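For illustration, placing the Jersey client jars next to the other Spark jars could look roughly like this (a hedged sketch; the 1.19 version is my own assumption and belongs to the old com.sun.jersey 1.x line, which is where the missing class lives, unlike the newer Glassfish 2.x artifacts):
# fetch the Jersey 1.x client jars (com.sun.jersey groupId) into the local Spark jars directory
cd ~/spark/spark-2.3.2-bin-hadoop2.7/jars/
wget https://repo1.maven.org/maven2/com/sun/jersey/jersey-client/1.19/jersey-client-1.19.jar
wget https://repo1.maven.org/maven2/com/sun/jersey/jersey-core/1.19/jersey-core-1.19.jar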
Full Error log
Error in force(code) :
Failed during initialize_connection: java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:181)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:168)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:151)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sparklyr.Invoke.invoke(invoke.scala:147)
at sparklyr.StreamHandler.handleMethodCall(stream.scala:136)
at sparklyr.StreamHandler.read(stream.scala:61)
at sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58)
at scala.util.control.Breaks.breakable(Breaks.scala:38)
at sparklyr.BackendHandler.channelRead0(handler.scala:38)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 49 more
Log: /tmp/RtmpIKnflg/filee462cec58ee_spark.log
---- Output Log ----
20/07/16 10:20:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/07/16 10:20:42 INFO sparklyr: Session (3779) is starting under 127.0.0.1 port 8880
20/07/16 10:20:42 INFO sparklyr: Session (3779) found port 8880 is not available
20/07/16 10:20:42 INFO sparklyr: Backend (3779) found port 8884 is available
20/07/16 10:20:42 INFO sparklyr: Backend (3779) is registering session in gateway
20/07/16 10:20:42 INFO sparklyr: Backend (3779) is waiting for registration in gateway
20/07/16 10:20:42 INFO sparklyr: Backend (3779) finished registration in gateway with status 0
20/07/16 10:20:42 INFO sparklyr: Backend (3779) is waiting for sparklyr client to connect to port 8884
20/07/16 10:20:43 INFO sparklyr: Backend (3779) accepted connection
20/07/16 10:20:43 INFO sparklyr: Backend (3779) is waiting for sparklyr client to connect to port 8884
20/07/16 10:20:43 INFO sparklyr: Backend (3779) received command 0
20/07/16 10:20:43 INFO sparklyr: Backend (3779) found requested session matches current session
20/07/16 10:20:43 INFO sparklyr: Backend (3779) is creating backend and allocating system resources
20/07/16 10:20:43 INFO sparklyr: Backend (3779) is using port 8885 for backend channel
20/07/16 10:20:43 INFO sparklyr: Backend (3779) created the backend
20/07/16 10:20:43 INFO sparklyr: Backend (3779) is waiting for r process to end
20/07/16 10:20:43 INFO SparkContext: Running Spark version 2.3.2
20/07/16 10:20:43 WARN SparkConf: spark.master yarn-client is deprecated in Spark 2.0+, please instead use "yarn" with specified deploy mode.
20/07/16 10:20:43 INFO SparkContext: Submitted application: sparklyr
20/07/16 10:20:43 INFO SecurityManager: Changing view acls to: ubuntu
20/07/16 10:20:43 INFO SecurityManager: Changing modify acls to: ubuntu
20/07/16 10:20:43 INFO SecurityManager: Changing view acls groups to:
20/07/16 10:20:43 INFO SecurityManager: Changing modify acls groups to:
20/07/16 10:20:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set()
20/07/16 10:20:43 INFO Utils: Successfully started service 'sparkDriver' on port 42419.
20/07/16 10:20:43 INFO SparkEnv: Registering MapOutputTracker
20/07/16 10:20:43 INFO SparkEnv: Registering BlockManagerMaster
20/07/16 10:20:43 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/07/16 10:20:43 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/07/16 10:20:43 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-583db378-821a-4990-bfd2-5fcaf95d071b
20/07/16 10:20:44 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/07/16 10:20:44 INFO SparkEnv: Registering OutputCommitCoordinator
20/07/16 10:20:44 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
20/07/16 10:20:44 INFO Utils: Successfully started service 'SparkUI' on port 4041.
20/07/16 10:20:44 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://{SPARK IP}
Then in the /tmp/RtmpIKnflg/filee462cec58ee_spark.log file
20/07/16 10:09:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/07/16 10:09:07 INFO sparklyr: Session (11296) is starting under 127.0.0.1 port 8880
20/07/16 10:09:07 INFO sparklyr: Session (11296) found port 8880 is not available
20/07/16 10:09:07 INFO sparklyr: Backend (11296) found port 8882 is available
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is registering session in gateway
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is waiting for registration in gateway
20/07/16 10:09:07 INFO sparklyr: Backend (11296) finished registration in gateway with status 0
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is waiting for sparklyr client to connect to port 8882
20/07/16 10:09:07 INFO sparklyr: Backend (11296) accepted connection
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is waiting for sparklyr client to connect to port 8882
20/07/16 10:09:07 INFO sparklyr: Backend (11296) received command 0
20/07/16 10:09:07 INFO sparklyr: Backend (11296) found requested session matches current session
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is creating backend and allocating system resources
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is using port 8883 for backend channel
20/07/16 10:09:07 INFO sparklyr: Backend (11296) created the backend
20/07/16 10:09:07 INFO sparklyr: Backend (11296) is waiting for r process to end
20/07/16 10:09:08 INFO SparkContext: Running Spark version 2.3.2
20/07/16 10:09:08 WARN SparkConf: spark.master yarn-client is deprecated in Spark 2.0+, please instead use "yarn" with specified deploy mode.
20/07/16 10:09:08 INFO SparkContext: Submitted application: sparklyr
20/07/16 10:09:08 INFO SecurityManager: Changing view acls to: david
20/07/16 10:09:08 INFO SecurityManager: Changing modify acls to: david
20/07/16 10:09:08 INFO SecurityManager: Changing view acls groups to:
20/07/16 10:09:08 INFO SecurityManager: Changing modify acls groups to:
20/07/16 10:09:08 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(david); groups with view permissions: Set(); users with modify permissions: Set(david); groups with modify permissions: Set()
20/07/16 10:09:08 INFO Utils: Successfully started service 'sparkDriver' on port 44541.
20/07/16 10:09:08 INFO SparkEnv: Registering MapOutputTracker
20/07/16 10:09:08 INFO SparkEnv: Registering BlockManagerMaster
20/07/16 10:09:08 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/07/16 10:09:08 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/07/16 10:09:08 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-d7b67ab2-508c-4488-ac1b-7ee0e787aa79
20/07/16 10:09:08 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/07/16 10:09:08 INFO SparkEnv: Registering OutputCommitCoordinator
20/07/16 10:09:08 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/07/16 10:09:08 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://{THE INTERNAL SPARK IP}:4040
20/07/16 10:09:08 INFO SparkContext: Added JAR file:/home/david/R/x86_64-pc-linux-gnu-library/4.0/sparklyr/java/sparklyr-2.3-2.11.jar at spark://{THE INTERNAL SPARK IP}:44541/jars/sparklyr-2.3-2.11.jar with timestamp 1594894148685
20/07/16 10:09:09 ERROR sparklyr: Backend (11296) failed calling getOrCreate on 11: java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:181)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:168)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:151)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sparklyr.Invoke.invoke(invoke.scala:147)
at sparklyr.StreamHandler.handleMethodCall(stream.scala:136)
at sparklyr.StreamHandler.read(stream.scala:61)
at sparklyr.BackendHandler$$anonfun$channelRead0$1.apply$mcV$sp(handler.scala:58)
at scala.util.control.Breaks.breakable(Breaks.scala:38)
at sparklyr.BackendHandler.channelRead0(handler.scala:38)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 49 more
I am using Cloudify 2.7 with OpenStack Icehouse. In particular, I have configured the cloud driver to bootstrap two management VMs (numberOfManagementMachines 2).
Sometimes, when I bootstrap the VMs, I receive the following error:
cloudify#default> bootstrap-cloud --verbose openstack-icehouse-<project_name>
...
Starting agent and management processes:
[VM_Floating_IP] nohup gs-agent.sh gsa.global.lus 0 gsa.lus 1 gsa.gsc 0 gsa.global.gsm 0 gsa.gsm 1 gsa.global.esm 1 >/dev/null 2>&1
[VM_Floating_IP] STARTING CLOUDIFY MANAGEMENT
[VM_Floating_IP] .
[VM_Floating_IP] Discovered agent nic-address=177.86.0.3 lookup-groups=gigaspaces-Cloudify-2.7.1-ga.
[VM_Floating_IP] Detected LUS management process started by agent null expected agent a0eec4e5-7fb0-4428-80e1-ec13a8b1c744
[VM_Floating_IP] Detected LUS management process started by agent a0eec4e5-7fb0-4428-80e1-ec13a8b1c744
[VM_Floating_IP] Detected GSM management process started by agent a0eec4e5-7fb0-4428-80e1-ec13a8b1c744
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] Waiting for Elastic Service Manager
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] .
[VM_Floating_IP] Waiting for Elastic Service Manager
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] .
[VM_Floating_IP] Waiting for Elastic Service Manager
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] .
[VM_Floating_IP] Waiting for Elastic Service Manager
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] .failure occurred while renewing an event lease: Operation failed. net.jini.core.lease.UnknownLeaseException: Unknown event id: 3
[VM_Floating_IP] at com.sun.jini.reggie.GigaRegistrar.renewEventLeaseInt(GigaRegistrar.java:5494)
[VM_Floating_IP] at com.sun.jini.reggie.GigaRegistrar.renewEventLeaseDo(GigaRegistrar.java:5475)
[VM_Floating_IP] at com.sun.jini.reggie.GigaRegistrar.renewEventLease(GigaRegistrar.java:2836)
[VM_Floating_IP] at com.sun.jini.reggie.RegistrarGigaspacesMethodinternalInvoke16.internalInvoke(Unknown Source)
[VM_Floating_IP] at com.gigaspaces.internal.reflection.fast.AbstractMethod.invoke(AbstractMethod.java:41)
[VM_Floating_IP] at com.gigaspaces.lrmi.LRMIRuntime.invoked(LRMIRuntime.java:464)
[VM_Floating_IP] at com.gigaspaces.lrmi.nio.Pivot.consumeAndHandleRequest(Pivot.java:561)
[VM_Floating_IP] at com.gigaspaces.lrmi.nio.Pivot.handleRequest(Pivot.java:662)
[VM_Floating_IP] at com.gigaspaces.lrmi.nio.Pivot$ChannelEntryTask.run(Pivot.java:196)
[VM_Floating_IP] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[VM_Floating_IP] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[VM_Floating_IP] at java.lang.Thread.run(Thread.java:662)
[VM_Floating_IP]
[VM_Floating_IP]
[VM_Floating_IP] Waiting for Elastic Service Manager
[VM_Floating_IP] Waiting for Management processes to start.
....
[VM_Floating_IP] ....Failed to add [Processing Unit Instance] with uid [8038e956-1ae2-4378-8bb1-e2055202c160]: Operation failed. java.rmi.ConnectException: Connect Failed to [NIO://177.86.0.3:7011/pid[4390]/164914896032_3_8060218823096628119_details[class org.openspaces.pu.container.servicegrid.PUServiceBeanImpl]]; nested exception is:
[VM_Floating_IP] java.net.SocketTimeoutException
...
[VM_Floating_IP] Failed to add [GSM] with uid [3c0e20e9-bf85-4d22-8ed6-3b387e690878]: Operation failed. java.rmi.ConnectException: Connect Failed to [NIO://177.86.0.3:7000/pid[4229]/154704895271_2_2245795805687723285_details[class com.gigaspaces.grid.gsm.GSMImpl]]; nested exception is:
[VM_Floating_IP] java.net.SocketTimeoutException
...
[VM_Floating_IP] Failed to add GSC with uid [8070dabb-d80d-43c7-bd9c-1d2478f95710]: Operation failed. java.rmi.ConnectException: Connect Failed to [NIO://177.86.0.3:7011/pid[4390]/164914896020_2_8060218823096628119_details[class com.gigaspaces.grid.gsc.GSCImpl]]; nested exception is:
[VM_Floating_IP] java.net.SocketTimeoutException
...
[VM_Floating_IP] Failed to add [GSA] with uid [a0eec4e5-7fb0-4428-80e1-ec13a8b1c744]: Operation failed. java.rmi.ConnectException: Connect Failed to [NIO://177.86.0.3:7002/pid[4086]/153569177936_2_8701370873164361474_details[class com.gigaspaces.grid.gsa.GSAImpl]]; nested exception is:
[VM_Floating_IP] java.net.SocketTimeoutException
...
[VM_Floating_IP] Waiting for Management processes to start.
[VM_Floating_IP] Failed to connect to LUS on 177.86.0.3:4174, retry in 73096ms: Operation failed. java.net.ConnectException: Connection timed out
...
[VM_Floating_IP] .Failed to add [ESM] with uid [996c8898-897c-4416-a877-82efb22c7ea6]: Operation failed. java.rmi.ConnectException: Connect Failed to [NIO://177.86.0.3:7003/pid[4504]/172954418920_2_5475350805758957057_details[class org.openspaces.grid.esm.ESMImpl]]; nested exception is:
[VM_Floating_IP] java.net.SocketTimeoutException
Can someone suggest a solution? Do I have to configure a timeout value?
Thanks.
------------------------Edited-------------------
I would like to add some information.
Each manager instance has 4 vCPUs, 8 GB RAM, and a 20 GB disk.
Each manager instance has the security groups created by Cloudify, that is:
cloudify-manager-cluster
Egress IPv4 Any - 0.0.0.0/0 (CIDR)
Egress IPv6 Any - ::/0 (CIDR)
cloudify-manager-management
Egress IPv4 Any - 0.0.0.0/0 (CIDR)
Egress IPv6 Any - ::/0 (CIDR)
Ingress IPv4 TCP 22 0.0.0.0/0 (CIDR)
Ingress IPv4 TCP 4174 cfy-mngt-cluster
Ingress IPv4 TCP 6666 cfy-mngt-cluster
Ingress IPv4 TCP 7000 cfy-mngt-cluster
Ingress IPv4 TCP 7001 cfy-mngt-cluster
Ingress IPv4 TCP 7002 cfy-mngt-cluster
Ingress IPv4 TCP 7003 cfy-mngt-cluster
Ingress IPv4 TCP 7010 - 7110 cfy-mngt-cluster
Ingress IPv4 TCP 8099 0.0.0.0/0 (CIDR)
Ingress IPv4 TCP 8100 0.0.0.0/0 (CIDR)
Moreover, Cloudify creates a private network "cloudify-manager-Cloudify-Management-Network" with subnet 177.86.0.0/24 and requests a floating IP for each VM.
The ESM is Cloudify's orchestrator. Only one instance of it should be running at any one time. The error indicates that the bootstrap process expected to find a running ESM but did not find one. This seems to be related to communication errors between the manager instances: is it possible that the security groups defined for the manager do not open all ports between the managers?
Security group/firewall configurations are the usual culprit. It is also possible that the manager VM is too small; it should have at least 4 GB RAM and 2 vCPUs.
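If the security groups are the cause, a rule allowing the managers to reach each other on the Cloudify/GigaSpaces ports would look roughly like this (a hedged sketch using the modern openstack CLI; the group name and ports are taken from the listing above, and the exact client syntax may differ on an Icehouse-era installation):
# allow manager-to-manager traffic on the LUS port (4174) and the LRMI port range (7000-7110) within the cluster group
openstack security group rule create --ingress --protocol tcp --dst-port 4174 --remote-group cloudify-manager-cluster cloudify-manager-cluster
openstack security group rule create --ingress --protocol tcp --dst-port 7000:7110 --remote-group cloudify-manager-cluster cloudify-manager-cluster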
Please keep in mind that Cloudify 2.X has reached end-of-life and is no longer supported. You may want to check out Cloudify 3.