Installed Puppet on a utility node - Unix

I'm running Puppet 6 on a utility node, and when I try to connect to the Puppet master from the Puppet agent I get this error:
[root@utility ~]# puppet agent --test
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Info: Retrieving pluginfacts
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet:///pluginfacts: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Info: Retrieving plugin
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Could not evaluate: Could not retrieve file metadata for puppet:///plugins: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Info: Loading facts
Error: Could not retrieve catalog from remote server: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get certificate CRL for /CN=utility.example.com]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Also, the agent's certificate does not show up on the Puppet master when I run puppet cert list --all:
Warning: `puppet cert` is deprecated and will be removed in a future release.
(location: /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/application.rb:370:in `run')

Since the agent is not issuing a certificate-signing request, it must already have a signed certificate. But it does not seem to be a certificate that the master recognizes, so the master will not accept it. Possibly the agent does not accept the master's cert, either.
The master refusing service to an unrecognized agent is exactly what one would expect and want if an unauthorized node attempted to retrieve a catalog. The agent refusing to complete a connection to the master is exactly what one would expect and want if an agent's catalog request were delivered to an imposter posing as the master.
But if an authorized agent is having such a problem requesting a catalog from a genuine master that it should recognize, then you have a trust failure. This might happen, for example, if the agent's original master were replaced with a new one, or if Puppet were removed from the master and then re-installed.
If indeed that master has no cert for the agent in question, then you should be able to resolve the issue by shutting down the agent (if it is running as a daemon), then clearing out its certificates so that it generates a new one on its next run. The Puppet docs describe how this can be done (you should need only step 3, "Clear and regenerate certs for Puppet agents", and only for the affected agent).
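In practice, on Puppet 6 that usually comes down to something like the following (a minimal sketch assuming the default ssldir and that the master still needs to sign the agent's new request; adjust the certname to the affected agent):
# On the agent: stop the daemon, clear its certificates, and request a new one
puppet resource service puppet ensure=stopped
rm -rf "$(puppet config print ssldir)"
puppet agent --test
# On the master: sign the fresh CSR (Puppet 6 replaces `puppet cert` with `puppetserver ca`)
puppetserver ca list
puppetserver ca sign --certname utility.example.com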

Related

Airflow SambaHook authentication issue with SpnegoError and Kerberos?

I am trying to connect to a Samba server in Airflow using the SambaHook class. The Samba server requires Kerberos authentication.
I have already defined a Samba connection in Airflow using the following parameters:
Host, Schema, and Extra ({"auth": "kerberos"}):
airflow connections add "samba_repo" --conn-type "samba" --conn-host "myhost.mywork.com" --conn-schema "fld" --conn-extra '{"auth": "kerberos"}'
I'm trying to use the SambaHook class in Airflow to connect to a Samba server. When I run my code, I get the following error:
Failed to authenticate with server: SpnegoError (1): SpnegoError (16): Operation not supported or available, Context: Retrieving NTLM store without NTLM_USER_FILE set to a filepath, Context: Unable to negotiate common mechanism
However, when I use smbclient to connect to the same server using Kerberos authentication from the Docker terminal, it works fine with the command: smbclient //'myhost'/'fld' -c 'ls "\workpath\*" ' -k
What I tried: I set up a connection to the Samba server in Airflow using the SambaHook class and tried to use the listdir method to retrieve a list of files in a specific directory.
What I expected to happen: I expected the listdir method to successfully retrieve a list of files in the specified directory from the Samba server.
What actually resulted: Instead, I encountered the following error message:
Failed to authenticate with server: SpnegoError (1): SpnegoError (16): Operation not supported or available, Context: Retrieving NTLM store without NTLM_USER_FILE set to a filepath, Context: Unable to negotiate common mechanism

Issues getting Kerberos/Windows AD login to work for a web service

I have been struggling with this for quite a while now, and I can't get it to work.
Here is the setup:
I have an nginx webserver serving a Django app at mywebapp.k8s.dal1.mycompany.io
It has the SPNEGO plugin compiled in and I have the following endpoint in my config:
location /ad-login {
uwsgi_pass django;
include /usr/lib/mycompany/lib/wsgi/uwsgi_params;
auth_gss on;
auth_gss_realm BURNERDEV1.DAL1.MYCOMPANY.IO;
auth_gss_service_name HTTP/mywebapp.k8s.dal1.mycompany.io;
auth_gss_allow_basic_fallback off;
}
My AD Domain controller is at burnerdev1.dal1.mycompany.io and I have the following users configured:
rep_movsd
portal
I run the following commands on the DC server in an Admin prompt:
ktpass -out krb5.keytab -mapUser portal@BURNERDEV1.DAL1.MYCOMPANY.IO +rndPass -mapOp set +DumpSalt -crypto AES256-SHA1 -ptype KRB5_NT_PRINCIPAL -princ HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO
setspn -A HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO portal
C:\Users\myself\Documents\keytab>ktpass -out krb5.keytab -mapUser portal@BURNERDEV1.DAL1.MYCOMPANY.IO +rndPass -mapOp set +DumpSalt -crypto AES256-SHA1 -ptype KRB5_NT_PRINCIPAL -princ HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO
Targeting domain controller: dal1devdc1.burnerdev1.dal1.mycompany.io
Using legacy password setting method
Failed to set property 'servicePrincipalName' to 'HTTP/mywebapp.k8s.dal1.mycompany.io' on Dn 'CN=portal,CN=Users,DC=burnerdev1,DC=dal1,DC=mycompany,DC=io': 0x13.
WARNING: Unable to set SPN mapping data.
If portal already has an SPN mapping installed for HTTP/mywebapp.k8s.dal1.mycompany.io, this is no cause for concern.
Building salt with principalname HTTP/mywebapp.k8s.dal1.mycompany.io and domain BURNERDEV1.DAL1.MYCOMPANY.IO (encryption type 18)...
Hashing password with salt "BURNERDEV1.DAL1.MYCOMPANY.IOHTTPmywebapp.k8s.dal1.mycompany.io".
Key created.
Output keytab to krb5.keytab:
Keytab version: 0x502
keysize 110 HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO ptype 1 (KRB5_NT_PRINCIPAL) vno 3 etype 0x12 (AES256-SHA1) keylength 32 (0x632d9ca3356374e9de490ec2f7718f9fb652b20da40bd212a808db4c46a72bc5)
C:\Users\myself\Documents\keytab>setspn -A HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO portal
Checking domain DC=burnerdev1,DC=dal1,DC=mycompany,DC=io
Registering ServicePrincipalNames for CN=portal,CN=Users,DC=burnerdev1,DC=dal1,DC=mycompany,DC=io
HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO
Updated object
C:\Users\myself\Documents\keytab>
Now in the "Active Directory Users and Computers" section, I right-clicked the user and selected "Properties".
Then on the "Delegation" tab I set "Trust this user for delegation to any service (Kerberos only)"
Next I copy the krb5.keytab file to my webserver and restart the nginx container
On the Windows workstation which is part of the domain, I log on as rep_movsd - when I run klist:
C:\Users\rep_movsd>klist
Current LogonId is 0:0x208d7
Cached Tickets: (2)
#0> Client: rep_movsd @ BURNERDEV1.DAL1.MYCOMPANY.IO
Server: krbtgt/BURNERDEV1.DAL1.MYCOMPANY.IO @ BURNERDEV1.DAL1.MYCOMPANY.IO
KerbTicket Encryption Type: AES-256-CTS-HMAC-SHA1-96
Ticket Flags 0x40e10000 -> forwardable renewable initial pre_authent name_canonicalize
Start Time: 7/16/2020 2:05:51 (local)
End Time: 7/16/2020 12:05:51 (local)
Renew Time: 7/23/2020 2:05:51 (local)
Session Key Type: AES-256-CTS-HMAC-SHA1-96
#1> Client: rep_movsd @ BURNERDEV1.DAL1.MYCOMPANY.IO
Server: HTTP/mywebapp.k8s.dal1.mycompany.io @ BURNERDEV1.DAL1.MYCOMPANY.IO
KerbTicket Encryption Type: AES-256-CTS-HMAC-SHA1-96
Ticket Flags 0x40a10000 -> forwardable renewable pre_authent name_canonicalize
Start Time: 7/16/2020 2:06:01 (local)
End Time: 7/16/2020 12:05:51 (local)
Renew Time: 7/23/2020 2:05:51 (local)
Session Key Type: AES-256-CTS-HMAC-SHA1-96
I set up Firefox to do SPNEGO authentication.
Then I hit mywebapp.k8s.dal1.mycompany.io/ad-login and I get a 403 Forbidden error
The nginx server debug log shows:
[debug] 16#16: *195 Client sent a reasonable Negotiate header
[debug] 16#16: *195 GSSAPI authorizing
[debug] 16#16: *195 Use keytab /etc/krb5.keytab
[debug] 16#16: *195 Using service principal: HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO
[debug] 16#16: *195 my_gss_name HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO
[debug] 16#16: *195 gss_accept_sec_context() failed: Cannot decrypt ticket for HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO using keytab key for HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO:
[debug] 16#16: *195 GSSAPI failed
[debug] 16#16: *195 http finalize request: 403, "/ad-login?" a:1, c:1
[debug] 16#16: *195 http special response: 403, "/ad-login?"
[debug] 16#16: *195 http set discard body
[debug] 16#16: *195 charset: "" > "utf-8"
[debug] 16#16: *195 HTTP/1.1 403 Forbidden
BTW while messing around earlier - I found that if I had set a fixed password for the "portal" user with ktpass and logged in as that account on the workstation, the login would succeed.
I was under the mistaken impression that I'd need to create a new keytab for every user and combine all of them.
Any help is greatly appreciated - I've read so many conflicting docs that they've only confused me further, and I've been losing sleep over this.
Thanks in advance!
I've read your problem statement carefully, and I think if you follow the steps I wrote below the issue will be solved.
1. On the DC server where you are creating the keytab: UAC must be temporarily disabled, and the user creating the keytab must be a member of the Domain Admins group.
2. Ensure the SPN is not a duplicate, then remove the SPN from the Active Directory user account portal. This must be done before creating a new keytab using the same SPN against the same account (the command below is a single line):
setspn -d HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO portal
3. Re-create the keytab exactly as you did before. You do not need to run setspn -A HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO portal afterwards, because the SPN is already set on the Active Directory user account by the ktpass command in step 3.
4. Replace the old keytab with the new keytab.
5. Restart the nginx webserver service.
6. Clear the browser cache AND clear the Kerberos ticket cache (klist purge).
7. Try it again.
You must do all of these steps, including the final step 7. Do not skip any.
Your service account is named portal. A hash of this password is stored in both Active Directory and the keytab; the same hash is in both locations. The keytab on the nginx server is used to decrypt the inbound Kerberos service tickets to determine which user is attempting to access the web app. More specifically, GSS authentication does all the work; it uses the keytab to unscramble the encrypted service tickets. The user rep_movsd does not have the service account credentials. It is part of the Active Directory domain, and when accessing the nginx web server it gets its own Kerberos service ticket; its identity is proven to the web server simply by being in possession of a service ticket that can be decrypted with the keytab. If it wasn't part of the BURNERDEV1.DAL1.MYCOMPANY.IO domain, or had an expired password, or was a disabled account, it would not be able to get a service ticket, and thus could not prove its identity and would fail authentication.
If you have time, please see my TechNet Wiki article on keytab creation and the logic behind it to help you better understand this complex subject.
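If the 403 persists after these steps, a quick sanity check on the nginx host (a minimal sketch, assuming the MIT Kerberos client tools are installed there) is to confirm the keytab entry matches what the KDC now expects; a kvno or encryption-type mismatch is a common cause of the "Cannot decrypt ticket ... using keytab key" error shown above:
# List the keytab entries with key version numbers and encryption types;
# they should match the vno and etype printed by ktpass
klist -ke /etc/krb5.keytab
# Prove the keytab key itself is valid by obtaining a ticket with it
kinit -kt /etc/krb5.keytab HTTP/mywebapp.k8s.dal1.mycompany.io@BURNERDEV1.DAL1.MYCOMPANY.IO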

How can CSR polling be restarted when joining a Corda Network?

When joining the UAT Corda Network and running initial registration as required, the Corda node was shut down before the CSR completed. https://uat.network.r3.com/pages/joining/joining.html
The CSR has been approved, but the Corda node does not have the correct certificates yet. When trying to start the node, it throws an exception for missing certificates.
[ERROR] 2019-07-26T16:47:51,099Z [main] internal.NodeStartupLogging.invoke - Exception during node startup: One or more keyStores (identity or TLS) or trustStore not found.
Please either copy your existing keys and certificates from another node, or if you don't have one yet, fill out the config file and run corda.jar initial-registration.
Read more at: https://docs.corda.net/permissioning.html [errorCode=16fn52g, moreInformationAt=https://errors.corda.net/ENT/4.1/16fn52g] {}
java.lang.IllegalArgumentException: One or more keyStores (identity or TLS) or trustStore not found. Please either copy your existing keys and certificates from another node, or if you don't have one yet, fill out the config file and run corda.jar initial-registration.
Read more at: https://docs.corda.net/permissioning.html
How can the CSR polling be completed?
Initial registration can be rerun and will resume polling based on the CSR id that is located in the certificates directory as certificate-request-id.txt. Rerun the same command used to start the CSR.
java -jar <CORDA JAR FILE> --initial-registration --network-root-truststore-password <TRUST STORE PASSWORD>

Issue with connecting Golang application on Cloud Run with Firestore

I am trying to get all documents from Firestore using the function below.
The credentials are stored in an encrypted file in a GCP Cloud Source repository.
I decrypted the configuration in the Cloud Build trigger and set an ENV in the Dockerfile pointing to the file. I can see that the file is present via RUN ls /app/credentials.json.
The error I get in the application log:
rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
This error is the result of an HTTPS failure where the certificate cannot be verified. The Alpine base image is missing a package that provides root certificates. Currently the Cloud Run quickstart is missing this for at least the Go language.
Assuming this is your problem, add the following to the final stage of your Dockerfile:
RUN apk add --no-cache ca-certificates
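For context, the final stage of a typical multi-stage Dockerfile for a Go service would then look roughly like this (a sketch only; the image tags, paths, and credentials handling are illustrative assumptions, not taken from the question):
FROM golang:1.20 AS build
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /server

FROM alpine:3.18
# Root CA bundle so TLS connections to Google APIs can be verified
RUN apk add --no-cache ca-certificates
COPY --from=build /server /server
COPY credentials.json /app/credentials.json
ENV GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json
CMD ["/server"]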

After changing passwords in vault.yml, deployment fails in Trellis

I had a WordPress site set up using Trellis. Initially I had set up the server and deployed without encrypting vault.yml.
Once everything was working fine, I changed the passwords in vault.yml and encrypted the file. But my deployment fails now.
And I get the following error:
TASK [deploy : WordPress Installed?]
**************************
System info:
Ansible 2.6.3; Darwin
Trellis version (per changelog): "Allow customizing Nginx `worker_connections`"
---------------------------------------------------
non-zero return code
Error: Error establishing a database connection. This either means that
the username and password information in your `wp-config.php` file is
incorrect or we can’t contact the database server at `localhost`. This
could mean your host’s database server is down.
fatal: [mysite.org]: FAILED! => {"changed": false,
"cmd": ["wp", "core", "is-installed", "--skip-plugins", "--skip-
themes", "--require=/srv/www/mysite.org/shared/tmp_multisite_constants.php"], "delta":
"0:00:00.224955", "end": "2019-01-04 16:59:01.531111",
"failed_when_result": true, "rc": 1, "start": "2019-01-04
16:59:01.306156", "stderr_lines": ["Error: Error establishing a
database connection. This either means that the username and password
information in your `wp-config.php` file is incorrect or we can’t
contact the database server at `localhost`. This could mean your host’s
database server is down."], "stdout": "", "stdout_lines": []}
to retry, use: --limit
@/Users/praneethavelamuri/Desktop/path/to/my/project/trellis/deploy.retry
Is there any step I missed? I followed these steps (a command-level sketch of the vault-related ones follows the list):
ansible-playbook server.yml -e env=staging
./bin/deploy.sh staging mysite.org
change passwords in staging/vault.yml
set vault password
inform ansible about password
encrypt the file
commit the file and push the repo
re-deploy, and then I get the error!
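For reference, the vault-related steps above typically look like this in a Trellis project (a minimal sketch assuming the default Trellis layout; the .vault_pass filename and the staging path are conventions, not details from the question):
# "set vault password" / "inform ansible about password"
echo 'your-vault-password' > .vault_pass        # referenced by vault_password_file = .vault_pass in ansible.cfg
# "encrypt the file", then "commit the file and push the repo"
ansible-vault encrypt group_vars/staging/vault.yml
git add group_vars/staging/vault.yml && git commit -m "Encrypt staging vault" && git push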
I got it solved. I had changed the sudo user password too in my vault, so SSHing into the server, changing the sudo password to the one mentioned in the vault, and then provisioning and deploying again solved the issue.
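In concrete terms, the fix amounts to something like this (the hostname and admin user are illustrative; the playbook and deploy commands are the same ones listed in the question):
# On the server: set the sudo user's password to the value now in staging/vault.yml
ssh admin@mysite.org
passwd
exit
# Then re-provision and re-deploy from the trellis directory
ansible-playbook server.yml -e env=staging
./bin/deploy.sh staging mysite.org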

Resources